Abstract

Given a function $f : \{0,1\}^n \to \{0,1\}$, the $f$-isomorphism testing problem requires a
randomized algorithm to distinguish functions that are identical to $f$ up to relabeling of the input
variables from functions that are far from being so. An important open question in property
testing is to determine for which functions $f$ we can test $f$-isomorphism with a constant number
of queries. Despite much recent attention to this question, essentially only two classes of
functions were known to be efficiently isomorphism testable: symmetric functions and juntas.

We unify and extend these results by showing that all partially symmetric functions (functions
invariant to the reordering of all but a constant number of their variables) are efficiently
isomorphism-testable. This class of functions, first introduced by Shannon, includes symmetric
functions, juntas, and many other functions as well. We conjecture that these functions are
essentially the only functions efficiently isomorphism-testable.

To prove our main result, we also show that partial symmetry is efficiently testable. In turn,
to prove this result we had to revisit the junta testing problem. We provide a new proof of
correctness of the nearly-optimal junta tester. Our new proof replaces the Fourier machinery of
the original proof with a purely combinatorial argument that exploits the connection between
sets of variables with low influence and intersecting families.

Another important ingredient in our proofs is a new notion of symmetric influence. We
use this measure of influence to prove that partial symmetry is efficiently testable and also to
construct an efficient sample extractor for partially symmetric functions. We then combine
the sample extractor with the testing-by-implicit-learning approach to complete the proof that
partially symmetric functions are efficiently isomorphism-testable.
1 Introduction



Property testing considers the following general problem: given a property $\mathcal{P}$, identify the minimum
number of queries required to determine with high probability whether an input has the property
$\mathcal{P}$ or whether it is far from $\mathcal{P}$. This question was first formalized by Rubinfeld and Sudan [27].

Definition 1 ([27]). Let $\mathcal{P}$ be a set of Boolean functions. An $\epsilon$-tester for $\mathcal{P}$ is a randomized
algorithm which queries an unknown function $f : \{0,1\}^n \to \{0,1\}$ on a small number of inputs and

(i) Accepts with probability at least 2/3 when $f \in \mathcal{P}$;

(ii) Rejects with probability at least 2/3 when $f$ is $\epsilon$-far from $\mathcal{P}$,

where $f$ is $\epsilon$-far from $\mathcal{P}$ if $\mathrm{dist}(f,g) := |\{x \in \{0,1\}^n \mid f(x) \neq g(x)\}| \geq \epsilon 2^n$ holds for every $g \in \mathcal{P}$.
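To make the distance in Definition 1 concrete, the following is a minimal brute-force sketch for small $n$. The function names `dist` and `is_eps_far_from` are illustrative choices and do not come from the paper, and the property is assumed to be given as an explicit finite list of candidate functions.

```python
from itertools import product

def dist(f, g, n):
    """Number of inputs in {0,1}^n on which f and g disagree (Definition 1)."""
    return sum(1 for x in product((0, 1), repeat=n) if f(x) != g(x))

def is_eps_far_from(f, candidates, n, eps):
    """True if f disagrees with every g in `candidates` on at least eps * 2^n inputs.

    `candidates` is an explicit list standing in for the property P
    (an assumption made for illustration; real properties are far too large to list).
    """
    return all(dist(f, g, n) >= eps * 2 ** n for g in candidates)
```

For example, the parity of three bits disagrees with each constant function on exactly 4 of the 8 inputs, so it is 1/2-far from the set of constant functions.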

Goldreich, Goldwasser, and Ron [22] extended the scope of this definition to graphs and other
combinatorial objects. Since then, the field of property testing has been very active. For an
overview of recent developments, we refer the reader to the surveys [25, 26] and the book [21].

A notable achievement in the field of property testing is the complete characterization of graph
properties that are testable with a constant number of queries [?]. An ambitious open problem is
obtaining a similar characterization for properties of Boolean functions. Recently there has been
a lot of progress on the restriction of this question to properties that are closed under linear or
affine transformations [?, 23]. More generally, one might hope to settle this open problem for all
properties of Boolean functions that are closed under relabeling of the input variables.

An important sub-problem of this open question is function isomorphism testing. Given a
Boolean function $f$, the $f$-isomorphism testing problem is to determine whether a function $g$ is
isomorphic to $f$ (that is, whether it is the same up to relabeling of the input variables) or far
from being so. A natural goal, and the focus of this paper, is to characterize the set of functions
for which isomorphism testing can be done with a constant number of queries.

Previous work. The function isomorphism testing problem was first raised by Fischer et al. [17].
They observed that fully symmetric functions are trivially isomorphism testable with a constant
number of queries. They also showed that every $k$-junta, that is, every function which depends on
at most $k$ of the input variables, is isomorphism testable with $\mathrm{poly}(k)$ queries. This bound was
recently improved by Chakraborty et al. [12], who showed that $O(k \log k)$ queries suffice. In particular,
these results imply that juntas on a constant number of variables are isomorphism testable with a
constant number of queries.

The first lower bound for isomorphism testing was also provided by Fischer et al. [17]. They
showed that for small enough values of $k$, testing isomorphism to a $k$-linear function (i.e., a function
that returns the parity of $k$ variables) requires $\Omega(\log k)$ queries.¹ Following a series of recent
works [20, 8, 9], the exact query complexity for testing isomorphism to $k$-linear functions has been
determined to be $\Theta(\min\{k, n-k\})$.

More general lower bounds for isomorphism testing were obtained by O'Donnell and the first
author [10]. In particular, they showed that testing isomorphism to any $k$-junta that is far from
being a $(k-1)$-junta requires $\Omega(\log \log k)$ queries. This lower bound gives a large family of functions
for which testing isomorphism requires a super-constant number of queries. Alon et al. have shown
that in fact the query complexity of testing isomorphism is $\Omega(n)$ for almost every function [4] (see
also [3, ?]).

¹ More precisely, they showed that non-adaptive testers require $\tilde{\Omega}(\sqrt{k})$ queries. Here and in the rest of this section,
tilde notation is used to hide logarithmic factors.






Partially symmetric functions. As seen above, the only functions which we know are isomorphism
testable with a constant number of queries are fully symmetric functions and juntas. Our
motivation for the current work was to see if we can unify and generalize the results to encompass
a larger class of functions. While symmetric functions and juntas may seem unrelated, there is in
fact a strong connection. Symmetric functions, of course, are invariant under any relabeling of the
input variables. Juntas satisfy a similar but slightly weaker invariance property. For every $k$-junta,
there is a set of at least $n-k$ variables such that the function is invariant to any relabeling of these
variables. Functions that satisfy this condition are called partially symmetric.

Definition 2 (Partially symmetric functions). For a subset $J \subseteq [n] := \{1, \ldots, n\}$, a function
$f : \{0,1\}^n \to \{0,1\}$ is $J$-symmetric if permuting the labels of the variables of $J$ does not change
the function. Moreover, $f$ is called $t$-symmetric if there exists $J \subseteq [n]$ of size at least $t$ such that $f$
is $J$-symmetric.
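Definition 2 can be checked directly on small examples. The sketch below is a brute-force illustration only: the names `is_J_symmetric` and `is_t_symmetric` are hypothetical, variables are indexed from 0 rather than 1, and the running time is exponential in $n$. It relies on the standard fact that transpositions generate all permutations, so testing every swap of two coordinates in $J$ suffices.

```python
from itertools import product, combinations

def is_J_symmetric(f, n, J):
    """Check by brute force that f is invariant under permuting the variables in J."""
    J = sorted(J)
    for x in product((0, 1), repeat=n):
        for a, b in combinations(J, 2):
            y = list(x)
            y[a], y[b] = y[b], y[a]          # swap two coordinates inside J
            if f(x) != f(tuple(y)):
                return False
    return True

def is_t_symmetric(f, n, t):
    """True if f is J-symmetric for some J of size at least t.

    A function symmetric on J is symmetric on every subset of J,
    so it is enough to try the sets of size exactly t.
    """
    return any(is_J_symmetric(f, n, J) for J in combinations(range(n), t))
```

For instance, the function $x_0 \oplus [x_1 + x_2 + x_3 \geq 2]$ on four variables is symmetric on the last three variables, hence 3-symmetric, but it is not 4-symmetric.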

Shannon first introduced partially symmetric functions as part of his investigation on the circuit 
complexity of Boolean functions [28] . He showed that while most functions require an exponential 
number of gates to compute, every partially symmetric function can be implemented much more 
efficiently. Research on the role of partial symmetry in the complexity of implementing functions 
in circuits, binary decision diagrams, and other models has remained active ever since [13, 24]. Our
results suggest that studying partially symmetric functions may also yield greater understanding 
of property testing on Boolean functions. 

Our results. The set of partially symmetric functions includes both juntas and symmetric
functions, and it contains many other functions as well. A natural question is whether this
entire class of functions is isomorphism testable with a constant number of queries. Our first main
result gives an affirmative answer to this question.

Theorem 1. For every $(n-k)$-symmetric function $f : \{0,1\}^n \to \{0,1\}$ there exists an $\epsilon$-tester for
$f$-isomorphism that performs $O(k \log k / \epsilon^2)$ queries.

A simple modification of an argument in Alon et al. [3] can be used to show that the bound in
the above theorem is tight up to logarithmic factors. Indeed, by this argument, testing isomorphism
to almost every $(n-k)$-symmetric function requires $\Omega(k)$ queries.

We believe that the theorem might also be best possible in a different way. That is, we conjecture
that the set of partially symmetric functions is essentially the set of functions for which testing
isomorphism can be done with a constant number of queries. We discuss this conjecture with some
supporting evidence later in the paper.

The proof of our first main theorem follows the general outline of the proof that isomorphism
testing to juntas can be done in a constant number of queries. The observation which allows us
to make this connection is the fact that partially symmetric functions can be viewed as junta-like
functions. More precisely, an $(n-k)$-symmetric function is a function that has $k$ special variables
where for each assignment for these variables, the restricted function is fully symmetric on the
remaining $n-k$ variables.

The proof for testing isomorphism of juntas has two main components. The first is an efficient 
junta testing algorithm. This enables us to reject functions that are far from being juntas. The 
second is a query efficient sampler of the "core" of the input function given that the function is 
close to a junta. The sampler can then be used in order to verify if the two juntas are indeed 
isomorphic. We generalize both of these components for partially symmetric functions. 

Our second main result, and the first component of the isomorphism tester, is an efficient
algorithm for testing partial symmetry.

Theorem 2. The property of being $(n-k)$-symmetric for $k < n/10$ is testable with $O(\frac{k}{\epsilon} \log \frac{k}{\epsilon})$
queries.

The natural approach for proving this theorem is to try to generalize the result on junta testing
in [7]. That result heavily relied on the notion of influence of variables. The influence of a set $S$ of
variables in a function $f$ is the probability that $f(x) \neq f(y)$ when $x$ is chosen uniformly at random
and $y$ is obtained from $x$ by re-randomizing the values of $x_i$ for each $i \in S$. The notion of influence
characterizes juntas: when $f$ is a $k$-junta, there is a set of size $n-k$ whose influence is 0, whereas
when $f$ is $\epsilon$-far from being a $k$-junta, every set of size $n-k$ has influence at least $\epsilon$.
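The influence of a set can be computed exactly for small $n$ by enumerating all pairs $(x, y)$. The following sketch is an illustration only; the name `influence` and the 0-based indexing are assumptions of this example, not notation from the paper.

```python
from itertools import product

def influence(f, n, S):
    """Exact influence of S in f: Pr[f(x) != f(y)], where x is uniform and
    y is obtained from x by re-randomizing the coordinates in S."""
    S = sorted(S)
    differing = 0
    total = 0
    for x in product((0, 1), repeat=n):
        for bits in product((0, 1), repeat=len(S)):
            y = list(x)
            for i, b in zip(S, bits):
                y[i] = b                      # fresh value for coordinate i
            differing += f(x) != f(tuple(y))
            total += 1
    return differing / total
```

On the 2-junta $x_0 \wedge x_1$ over four variables, the set $\{2, 3\}$ of irrelevant variables has influence 0, while the single variable $\{0\}$ has influence 1/4.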

We introduce a different notion of influence which we call symmetric influence. The symmetric
influence of a set $S$ of variables in $f$ is the probability that $f(x) \neq f(y)$ when $x$ is chosen uniformly
at random and $y$ is obtained from $x$ by permuting the values of $\{x_i\}_{i \in S}$. This notion characterizes
partially symmetric functions and satisfies several other useful properties. We provide the details
in Section 3.

The proof of junta testing also relies on nice properties of the Fourier representation of the 
notion of influence. While symmetric influence has a clean Fourier representation, unfortunately 
it does not have the properties needed to carry over the proof in [7] to the setting of partially 
symmetric functions. Instead, we must come up with a new proof technique. 

Our proof of Theorem 2 uses a new connection to intersecting families. A family $\mathcal{F}$ of subsets
of $[n]$ is $t$-intersecting if for every pair of sets $S, T \in \mathcal{F}$, their intersection size is at least
$|S \cap T| \geq t$. This notion was introduced by Erdős, Ko, and Rado and a sequence of works led to the
complete characterization of the maximum size of $t$-intersecting families that contain sets of fixed
size [?, ?, 30, 2]. Dinur, Safra, and Friedgut recently extended those results to give bounds on
the biased measure of intersecting families [15, 19].

Using results on intersecting families, we obtain a new and improved proof for the main lemma
at the heart of the junta testing result [7]. We describe the new proof and the connection to
intersecting families in Section 2. Most importantly, the same technique can also be extended to
complete the proof of Theorem 2. We present this proof in Section 4.

The second and final component of the isomorphism test for partially symmetric functions is
an efficient way to sample the core of such functions. An $(n-k)$-symmetric function $f$, which
is symmetric over a set $J \subseteq [n]$ of size $|J| = n-k$, has a concise representation as a function
$f_{\mathrm{core}} : \{0,1\}^k \times \{0, 1, \ldots, n-k\} \to \{0,1\}$ which we call the core of $f$. The core is the restriction of
$f$ to the variables in $\bar{J} = [n] \setminus J$ (in the natural order), with the additional Hamming weight of the variables
in $J$. To determine if two partially symmetric functions are isomorphic, it suffices to determine
whether their cores are isomorphic. We do so with the help of an efficient sample extractor.
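As an illustration of the core, the sketch below tabulates $f_{\mathrm{core}}$ for a function that is known to be symmetric on a given set $J$. The name `core`, the dictionary representation, and the 0-based indexing are assumptions made for this example; the code also assumes, without checking, that $f$ really is $J$-symmetric.

```python
from itertools import product

def core(f, n, J):
    """Tabulate the core of a function f that is symmetric on the set J.

    Returns a dict mapping (x, w) to a value of f, where x is the assignment
    to the variables outside J (in increasing index order) and w is the
    Hamming weight of the variables inside J.
    """
    J = sorted(J)
    rest = [i for i in range(n) if i not in J]
    table = {}
    for x in product((0, 1), repeat=len(rest)):
        for w in range(len(J) + 1):
            full = [0] * n
            for i, b in zip(rest, x):
                full[i] = b
            for i in J[:w]:                   # any weight-w assignment to J gives the same value
                full[i] = 1
            table[(x, w)] = f(tuple(full))
    return table
```

The table has $2^k (n-k+1)$ entries, compared with $2^n$ for the full truth table, which is the sense in which the representation is concise.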

Definition 3. A (1 query) $\delta$-sampler for the $(n-k)$-symmetric function $f : \{0,1\}^n \to \{0,1\}$ is a
randomized algorithm that queries $f$ on a single input and returns a triplet $(x, w, z) \in \{0,1\}^k \times
\{0, 1, \ldots, n-k\} \times \{0,1\}$ where

• The distribution of $(x, w)$ is $\delta$-close, in total variation distance, to $x$ being uniform over $\{0,1\}^k$
and $w$ being binomial over $\{0, 1, \ldots, n-k\}$ independently, and

• $z = f_{\mathrm{core}}(x, w)$ with probability at least $1 - \delta$.

Our third main result is that for any $(n-k)$-symmetric function $f$, there is a query-efficient
algorithm for constructing a $\delta$-sampler for $f$.

Theorem 3. Let $f : \{0,1\}^n \to \{0,1\}$ be $(n-k)$-symmetric with $k < n/10$. There is an algorithm
that queries $f$ on O(A log A) inputs and with probability at least $1 - \eta$ outputs a $\delta$-sampler for $f$.

This theorem is a generalization of a recent result of Chakraborty et al. [11], who gave a similar
construction for sampling the core of juntas. Their result has many applications related to testing
by implicit learning [?]. Our result may be of independent interest for similar applications.
We elaborate on this topic and present the proof of Theorem 3 in Section 5.

2 Intersecting families and testing juntas 

We begin by revisiting the problem of junta testing. In this section, we give a new proof of the
correctness of the $k$-junta tester first introduced in [7]. At a high level, the junta tester is quite
simple: it partitions the set of indices into a large enough number of parts, then tries to identify
all the parts that contain a relevant variable. If at most $k$ such parts are found, the test accepts;
otherwise it rejects. The algorithm is described in Junta-Test. (See [7] for more details.)

Algorithm 1 Junta-Test($f$, $k$, $\epsilon$)

1: Create a random partition $\mathcal{I}$ of the set $[n]$ into $r = \Theta(k^2)$ parts, and initialize $J = \emptyset$.
2: for each $i = 1$ to $\Theta(k/\epsilon)$ do
3:     Sample $x, y \in \{0,1\}^n$ uniformly at random.
4:     if $f(x) \neq f(x_J y_{\bar{J}})$ then
5:         Use binary search to find a set $I \in \mathcal{I}$ that contains a relevant variable.
6:         Set $J := J \cup I$.
7:         if $J$ is the union of $> k$ parts then reject.
8: Accept.
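The steps of Junta-Test can be sketched in code as follows. This is an illustrative sketch rather than the authors' implementation: the constants `4 * k * k + 1` and `12 * k / eps` are arbitrary stand-ins for the $\Theta(k^2)$ and $\Theta(k/\epsilon)$ of the algorithm, the name `junta_test` and its optional parameters are hypothetical, and the binary search is realized over hybrid inputs that take the coordinates of a growing prefix of the remaining parts from $y$.

```python
import random

def junta_test(f, n, k, eps, r=None, rounds=None, rng=random):
    """Sketch of Junta-Test; returns True to accept and False to reject."""
    r = r if r is not None else 4 * k * k + 1                       # stand-in for Theta(k^2)
    rounds = rounds if rounds is not None else int(12 * k / eps) + 1  # stand-in for Theta(k/eps)
    # Random partition: each index is placed in a uniformly random part.
    part_of = [rng.randrange(r) for _ in range(n)]
    found = set()            # parts known to contain a relevant variable

    def hybrid(x, y, parts):
        # Copy of x in which the coordinates lying in `parts` are taken from y.
        return tuple(y[i] if part_of[i] in parts else x[i] for i in range(n))

    for _ in range(rounds):
        x = tuple(rng.randrange(2) for _ in range(n))
        y = tuple(rng.randrange(2) for _ in range(n))
        free = [p for p in range(r) if p not in found]
        if f(x) == f(hybrid(x, y, set(free))):
            continue
        # Binary search over prefixes of `free`.
        # Invariant: f(hybrid with free[:lo]) == f(x) != f(hybrid with free[:hi]).
        lo, hi = 0, len(free)
        f_lo = f(x)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if f(hybrid(x, y, set(free[:mid]))) != f_lo:
                hi = mid
            else:
                lo = mid
        found.add(free[lo])  # the two hybrids differ only inside this part
        if len(found) > k:
            return False
    return True
```

Because a part is added only when changing its coordinates alone flips the value of $f$, a $k$-junta can never cause more than $k$ parts to be found, which matches the observation that Junta-Test always accepts $k$-juntas.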



It is clear that Junta-Test always accepts $k$-juntas. The non-trivial part of the analysis
involves showing that functions that are far from $k$-juntas are rejected by the tester with sufficiently
high probability. To do so, we must argue that the inequality in Step 4 is satisfied with
non-negligible probability whenever $f$ is far from $k$-juntas and $J$ is the union of at most $k$ parts. This
is accomplished by considering the influence of variables in a function.

The influence of the set $J \subseteq [n]$ of variables in the function $f : \{0,1\}^n \to \{0,1\}$ is

$$\mathrm{Inf}_f(J) = \Pr_{x,y}\left[f(x) \neq f(x_{\bar{J}} y_J)\right],$$

where $x_{\bar{J}} y_J$ is the vector $z \in \{0,1\}^n$ obtained by setting $z_i = y_i$ for every $i \in J$ and $z_i = x_i$ for
every $i \in [n] \setminus J$. By definition, the probability that the inequality in Step 4 is satisfied is exactly
$\mathrm{Inf}_f(\bar{J})$. To complete the analysis of correctness of the algorithm, we want to show that when $f$
is $\epsilon$-far from $k$-juntas, with high probability over the choice of the random partition $\mathcal{I}$, if $J$ is the
union of at most $k$ parts in $\mathcal{I}$, then $\mathrm{Inf}_f(\bar{J}) \geq \epsilon/4$. We do so by exploiting only a couple of basic facts
about the notion of influence.

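Lemma 1 (Fischer et al. [17]). For every $f : \{0,1\}^n \to \{0,1\}$ and every $J, K \subseteq [n]$,

$$\mathrm{Inf}_f(J) \leq \mathrm{Inf}_f(J \cup K) \leq \mathrm{Inf}_f(J) + \mathrm{Inf}_f(K).$$

Furthermore, if $f$ is $\epsilon$-far from $k$-juntas and $|J| \leq k$, then $\mathrm{Inf}_f(\bar{J}) \geq \epsilon$.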

We also use the fact that the family of sets $J \subseteq [n]$ whose complements have small influence forms
an intersecting family. For a fixed $t \geq 1$, a family $\mathcal{F}$ of subsets of $[n]$ is called $t$-intersecting if any
two sets $J$ and $K$ in $\mathcal{F}$ have intersection size $|J \cap K| \geq t$. Much of the work in this area focused on
bounding the size of $t$-intersecting families that contain only sets of a fixed size. Dinur and Safra [15]
considered general families and asked what the maximum $p$-biased measure of such families can be.
For $0 < p < 1$, this measure is defined as $\mu_p(\mathcal{F}) := \Pr_J[J \in \mathcal{F}]$ where the probability over $J$ is
obtained by including each coordinate $i \in [n]$ in $J$ independently with probability $p$. They showed
that 2-intersecting families have small $p$-biased measure [15] and Friedgut showed how the same
result also extends to $t$-intersecting families for $t > 2$ [19].

Theorem 4 (Dinur and Safra [15]; Friedgut [19]). Let $\mathcal{F}$ be a $t$-intersecting family of subsets of $[n]$
for some $t \geq 1$. For any $p < \frac{1}{t+1}$, the $p$-biased measure of $\mathcal{F}$ is bounded by $\mu_p(\mathcal{F}) \leq p^t$.
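For intuition about Theorem 4, the $p$-biased measure of an explicitly listed family can be computed directly. The sketch below assumes the family is given as a list of `frozenset` objects over the ground set $\{0, \ldots, n-1\}$; the names `is_t_intersecting` and `biased_measure` are hypothetical.

```python
from itertools import combinations

def is_t_intersecting(family, t):
    """True if every two sets in the family share at least t elements."""
    return all(len(A & B) >= t for A, B in combinations(family, 2))

def biased_measure(family, n, p):
    """mu_p(F): probability that a p-biased random subset of [n] lies in F."""
    return sum(p ** len(A) * (1 - p) ** (n - len(A)) for A in family)
```

The family of all subsets of $\{0,1,2,3\}$ that contain both 0 and 1 is 2-intersecting and has measure exactly $p^2$, so the bound $\mu_p(\mathcal{F}) \leq p^t$ is attained in this example.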

We are now ready to complete the analysis of Junta-Test. 

Lemma 2. Let $f : \{0,1\}^n \to \{0,1\}$ be a function $\epsilon$-far from $k$-juntas and $\mathcal{I}$ be a random partition
of $[n]$ into $r = c \cdot k^2$ parts, for some large enough constant $c$. Then with probability at least 5/6,
$\mathrm{Inf}_f(\bar{J}) \geq \epsilon/4$ for any union $J$ of $k$ parts from $\mathcal{I}$.

Proof. For $0 \leq t \leq \frac{1}{2}$, let $\mathcal{F}_t = \{J \subseteq [n] : \mathrm{Inf}_f(\bar{J}) < t\epsilon\}$ be the family of all sets whose complements
have influence less than $t\epsilon$. For any two sets $J, K \in \mathcal{F}_{1/2}$, the sub-additivity of influence implies that

$$\mathrm{Inf}_f(\overline{J \cap K}) = \mathrm{Inf}_f(\bar{J} \cup \bar{K}) \leq \mathrm{Inf}_f(\bar{J}) + \mathrm{Inf}_f(\bar{K}) < 2 \cdot \tfrac{\epsilon}{2} = \epsilon.$$

But $f$ is $\epsilon$-far from $k$-juntas, so every set $S \subseteq [n]$ of size $|S| \leq k$ satisfies $\mathrm{Inf}_f(\bar{S}) \geq \epsilon$. Therefore,
$|J \cap K| > k$ and, since this argument applies to every pair of sets in the family, $\mathcal{F}_{1/2}$ is a $(k+1)$-intersecting
family.

Let us now consider two separate cases: when $\mathcal{F}_{1/2}$ contains a set of size less than $2k$; and when
it does not. In the first case, let $J \in \mathcal{F}_{1/2}$ be one of the sets of size $|J| < 2k$. With high probability,
the set $J$ is completely separated by the partition $\mathcal{I}$. When this event occurs, then for every other
set $K \in \mathcal{F}_{1/2}$, $|J \cap K| \geq k+1$, which means that $K$ is not covered by any union of $k$ parts in $\mathcal{I}$.
Therefore, with high probability $f$ is $\frac{\epsilon}{2}$-far (and thus also $\frac{\epsilon}{4}$-far) from $k$-part juntas with respect to
$\mathcal{I}$, as we wanted to show.

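Consider now the case where $\mathcal{F}_{1/2}$ contains only sets of size at least $2k$. Then we claim that
$\mathcal{F}_{1/4}$ is a $2k$-intersecting family: otherwise, we could find sets $J, K \in \mathcal{F}_{1/4}$ such that $|J \cap K| < 2k$
and $\mathrm{Inf}_f(\overline{J \cap K}) \leq \mathrm{Inf}_f(\bar{J}) + \mathrm{Inf}_f(\bar{K}) < \frac{\epsilon}{2}$, contradicting our assumption.

Let $J \subseteq [n]$ be the union of $k$ parts in $\mathcal{I}$. Since $\mathcal{I}$ is a random partition, $J$ is a random subset
obtained by including each element of $[n]$ in $J$ independently with probability $p = \frac{k}{r} < \frac{1}{2k+1}$. By
Theorem 4,

$$\Pr\left[\mathrm{Inf}_f(\bar{J}) < \tfrac{\epsilon}{4}\right] = \Pr\left[J \in \mathcal{F}_{1/4}\right] = \mu_{k/r}(\mathcal{F}_{1/4}) \leq (k/r)^{2k}.$$

Applying the union bound over the possible choices of $J$, we get that $f$ is $\frac{\epsilon}{4}$-close to a $k$-part junta
with respect to $\mathcal{I}$ with probability at most

$$\binom{r}{k} \left(\frac{k}{r}\right)^{2k} \leq \left(\frac{er}{k}\right)^{k} \left(\frac{k}{r}\right)^{2k} = \left(\frac{ek}{r}\right)^{k},$$

which is at most 1/6 when $r = c \cdot k^2$ for a large enough constant $c$. □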

3 Symmetric influence 

The main focus of this paper is partially symmetric functions, that is, functions invariant under any
reordering of the variables of some set $J \subseteq [n]$. Let $S_J$ denote the set of permutations of $[n]$ which
only move elements from the set $J$. A function $f : \{0,1\}^n \to \{0,1\}$ is $J$-symmetric if $f(x) = f(\pi x)$
for every input $x$ and every permutation $\pi \in S_J$, where $\pi x$ is the vector whose $\pi(i)$-th coordinate is $x_i$.

To better analyze partially symmetric functions, we introduce a new measure named symmetric
influence. The symmetric influence of a set measures how invariant the function is to reordering
of the elements in that set.

Definition 4 (Symmetric influence). The symmetric influence of a set $J \subseteq [n]$ of variables in a
Boolean function $f : \{0,1\}^n \to \{0,1\}$ is defined as

$$\mathrm{SymInf}_f(J) = \Pr_{x \in \{0,1\}^n,\ \pi \in S_J}\left[f(x) \neq f(\pi x)\right].$$

It is not hard to see that in fact a function $f$ is $t$-symmetric iff there exists a set $J$ of size $t$ such
that $\mathrm{SymInf}_f(J) = 0$. A much stronger connection, however, exists between these properties as we
will shortly describe.

Before showing some nice properties of symmetric influence, we mention that it also has a simple
representation using Fourier coefficients of the function. Although we do not use the representation
in this paper, we feel it might be of independent interest. See Appendix A.1 for details.

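Lemma 3. Given a function $f : \{0,1\}^n \to \{0,1\}$ and a subset $J \subseteq [n]$, let $f_J$ be the $J$-symmetric
function closest to $f$. Then, the symmetric influence of $J$ satisfies

$$\mathrm{dist}(f, f_J) \leq \mathrm{SymInf}_f(J) \leq 2 \cdot \mathrm{dist}(f, f_J).$$

Proof. For every weight $0 \leq w \leq n$ and $z \in \{0,1\}^{|\bar{J}|}$, define the layer
$L^{w}_{\bar{J} \leftarrow z} := \{x \in \{0,1\}^n : |x| = w \wedge x_{\bar{J}} = z\}$ to be the vectors of Hamming weight $w$ which
identify with $z$ over the set $\bar{J}$ (where $L^{w}_{\bar{J} \leftarrow z} \neq \emptyset$ if $|z| \leq w \leq |J| + |z|$, and
$L^{w}_{\bar{J} \leftarrow z} = \emptyset$ otherwise). Let $p^{w}_{z} \in [0, \frac{1}{2}]$ be the fraction of the vectors
in $L^{w}_{\bar{J} \leftarrow z}$ one has to modify in order to make the restriction of $f$ over $L^{w}_{\bar{J} \leftarrow z}$ constant.

With these notations, we can restate the definition of the symmetric influence of $J$ as follows.

$$\mathrm{SymInf}_f(J) = \sum_{z} \sum_{w} \Pr_x\left[x \in L^{w}_{\bar{J} \leftarrow z}\right] \cdot \Pr_{x,\pi}\left[f(x) \neq f(\pi x) \mid x \in L^{w}_{\bar{J} \leftarrow z}\right]
= \frac{1}{2^n} \sum_{z} \sum_{w} \left|L^{w}_{\bar{J} \leftarrow z}\right| \cdot 2 p^{w}_{z} (1 - p^{w}_{z}).$$

This holds as in each such layer, the probability that $x$ and $\pi x$ would result in two different outcomes
is the probability that $x$ would be chosen out of the smaller part and $\pi x$ from the complement, or
vice versa.

The function $f_J$ can be obtained by modifying $f$ at a $p^{w}_{z}$ fraction of the inputs in each layer $L^{w}_{\bar{J} \leftarrow z}$,
as each layer can be addressed separately and we want to modify as few inputs as possible. By this
observation, we have the following equality.

$$\mathrm{dist}(f, f_J) = \frac{1}{2^n} \sum_{z} \sum_{w} \left|L^{w}_{\bar{J} \leftarrow z}\right| \cdot p^{w}_{z}.$$

But since $1 - p^{w}_{z} \in [\frac{1}{2}, 1]$, we have that $p^{w}_{z} \leq 2 p^{w}_{z} (1 - p^{w}_{z}) \leq 2 p^{w}_{z}$ and therefore
$\mathrm{dist}(f, f_J) \leq \mathrm{SymInf}_f(J) \leq 2 \cdot \mathrm{dist}(f, f_J)$ as required. □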
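The two quantities compared in Lemma 3 can be computed exactly for small $n$, which gives a quick numerical sanity check of the inequality. The sketch below is illustrative only: the function names are hypothetical, indices are 0-based, $f$ is assumed to return the integers 0 or 1, and the distance is normalized by $2^n$ as in the proof above.

```python
from itertools import product, permutations
from collections import defaultdict

def symmetric_influence(f, n, J):
    """Exact SymInf_f(J): Pr[f(x) != f(pi x)] for uniform x and uniform pi in S_J."""
    J = sorted(J)
    differing = total = 0
    for x in product((0, 1), repeat=n):
        for image in permutations(J):
            y = list(x)
            for src, dst in zip(J, image):
                y[dst] = x[src]               # coordinate pi(src) of pi x equals x_src
            differing += f(x) != f(tuple(y))
            total += 1
    return differing / total

def dist_to_J_symmetric(f, n, J):
    """Fraction of inputs to change so that f becomes constant on every layer."""
    J = set(J)
    layers = defaultdict(lambda: [0, 0])      # (z, weight) -> [#inputs with f=0, #inputs with f=1]
    for x in product((0, 1), repeat=n):
        z = tuple(x[i] for i in range(n) if i not in J)
        layers[(z, sum(x))][f(x)] += 1
    return sum(min(counts) for counts in layers.values()) / 2 ** n
```

For the dictator function $f(x) = x_0$ on four variables with $J$ equal to all of them, the distance is 5/16 and the symmetric influence is 3/8, consistent with both inequalities of the lemma.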

Corollary 1. Let $f : \{0,1\}^n \to \{0,1\}$ be a function that is $\epsilon$-far from being $t$-symmetric. Then for
every set $J \subseteq [n]$ of size $|J| \geq t$, $\mathrm{SymInf}_f(J) \geq \epsilon$ holds.

Proof. Fix $J \subseteq [n]$ of size $|J| \geq t$ and let $g$ be a $J$-symmetric function closest to $f$. Since $g$ is
symmetric on any subset of $J$, it is in particular $t$-symmetric and therefore $\mathrm{dist}(f, g) \geq \epsilon$ as $f$ is
$\epsilon$-far from being $t$-symmetric. Thus, by Lemma 3, $\mathrm{SymInf}_f(J) \geq \mathrm{dist}(f, g) \geq \epsilon$ holds. □

Corollary 1 demonstrates the strong connection between symmetric influence and the distance
from being partially symmetric, similar to the second part of Lemma 1 for influence and juntas. The
additional properties of influence used in Section 2 are monotonicity and sub-additivity (Lemma 1).
The following lemmas show that the same properties (approximately) hold for symmetric influence.
See Appendices A.2 and A.3 for the proofs of both lemmas.

Lemma 4 (Monotonicity). For any function $f : \{0,1\}^n \to \{0,1\}$ and any sets $J \subseteq K \subseteq [n]$,

$$\mathrm{SymInf}_f(J) \leq \mathrm{SymInf}_f(K).$$

Lemma 5 (Weak sub-additivity). There is a universal constant $c$ such that, for any constant
$0 < \gamma < 1$, a function $f : \{0,1\}^n \to \{0,1\}$, and sets $J, K \subseteq [n]$ of size at least $(1 - \gamma)n$,

$$\mathrm{SymInf}_f(J \cup K) \leq \mathrm{SymInf}_f(J) + \mathrm{SymInf}_f(K) + c\sqrt{\gamma}.$$

4 Testing partial symmetry 

Let us now return to the problem of testing partial symmetry. The goal of this section is to
introduce an efficient tester for this property by combining the ideas from Sections 2 and 3.

We begin by introducing the testing algorithm Partially-Symmetric-Test. This algorithm
is conceptually very similar to the junta tester in Section 2. Again, the main idea is to partition
the variables into $O(k^2)$ parts and identify the parts that contain "asymmetric" variables. More
precisely, given a function $f : \{0,1\}^n \to \{0,1\}$, let us write $\mathrm{core}(f) \subseteq [n]$ to be the maximum set
$J$ of variables such that $f$ is $J$-symmetric. We call the variables in $\mathrm{core}(f)$ symmetric and the
variables in $[n] \setminus \mathrm{core}(f)$ asymmetric. The function is $(n-k)$-symmetric iff it contains
at most $k$ asymmetric variables. The algorithm exploits this characterization by trying to identify
$k+1$ parts that contain asymmetric variables.

Algorithm 2 Partially-Symmetric-Test($f$, $k$, $\epsilon$)

1: Create a random partition $\mathcal{I}$ of $[n]$ into $r = \Theta(k^2/\epsilon^2)$ parts, and initialize $J := \emptyset$.
2: Pick a random workspace $W \in \mathcal{I}$, and if $|W| < \frac{n}{2r}$ then fail.
3: for each $i = 1$ to $\Theta(k/\epsilon)$ do
4:     Let $I :=$ Find-Asymmetric-Set($f$, $\mathcal{I}$, $J$, $W$).
5:     if $I \neq \emptyset$ then
6:         Set $J := J \cup I$.
7:         if $J$ is the union of $> k$ parts then reject.
8: Accept.



There are two main differences in the analysis of Partially-Symmetric-Test and of Junta-Test
in Section 2. The first is that we can no longer use a simple binary search algorithm to
identify the parts that contain asymmetric variables, as we need to maintain the Hamming weight
of our queries. To overcome this challenge, we introduce the Find-Asymmetric-Set function,
which satisfies the following properties.

Lemma 6. Let $f$ be a function, $\mathcal{I}$ be a partition of $[n]$ into $r$ parts, $W \in \mathcal{I}$ with $|W| \geq \frac{n}{2r}$ be a workspace,
and $J$ be a union of parts from $\mathcal{I} \setminus \{W\}$. Then, there exists an algorithm
Find-Asymmetric-Set($f$, $\mathcal{I}$, $J$, $W$) which performs $O(\log r)$ queries such that

• With probability $\mathrm{SymInf}_f(\bar{J})$, the algorithm returns a set $I \in \mathcal{I} \setminus \{W\}$ disjoint from $J$; otherwise
it returns $\emptyset$.

• If $W$ has no asymmetric variable and $I \in \mathcal{I}$ is returned, then $I$ has an asymmetric variable.

Due to space constraints, we provide a rough sketch of the algorithm and defer the details and
analysis to Appendix B.1. Find-Asymmetric-Set generates a random pair of $x \in \{0,1\}^n$ and
$\pi \in S_{\bar{J}}$ and checks whether $f(x) \neq f(\pi x)$. When this occurs, which happens with probability at
least $\epsilon$ when $\mathrm{SymInf}_f(\bar{J}) \geq \epsilon$, we know there exists some asymmetric variable in $\bar{J}$. In order to
identify a part $I \in \mathcal{I}$, disjoint from $J$ and the workspace $W$, which contains an asymmetric variable, we
iteratively change $x$ to $\pi x$. In each step, we only permute bits in one part $I \in \mathcal{I}$ and the workspace
$W$. Since $f(x) \neq f(\pi x)$, we can find using binary search a set $I$, disjoint from $J$, such that permuting
bits in $I \cup W$ changes the value of $f$. By our assumption, $W$ has no asymmetric variables and
therefore $I$ must contain such a variable.

The second and more important challenge in the analysis of Partially-Symmetric-Test is
the use of symmetric influence (rather than influence). Similar to Lemma 2 for influence, we prove
that if a function is far from being $(n-k)$-symmetric, then it is also far from being symmetric on
any union of all but $k$ parts of a random partition (assuming it has enough parts). The formal
statement is given in Lemma 7; its proof follows a very similar technique to that of Lemma 2.

Lemma 7. Let $f : \{0,1\}^n \to \{0,1\}$ be a function $\epsilon$-far from $(n-k)$-symmetric and $\mathcal{I}$ be a random
partition of $[n]$ into $r = c \cdot k^2/\epsilon^2$ parts, for some large enough constant $c$. Then with probability at
least 8/9, $\mathrm{SymInf}_f(\bar{J}) \geq \frac{\epsilon}{9}$ holds for any union $J$ of $k$ parts.

The main difference between this proof and the one of Lemma 2 arises from the weak
sub-additivity of symmetric influence. In light of this difference, our definition of families of sets
whose complement has small symmetric influence includes only sets which are not too big. We use
the observation that adding sets which contain elements of a family does not change its existing
intersection. In addition, due to the additive factor of the sub-additivity we prove a slightly weaker
result where the symmetric influence is at least $\epsilon/9$ and not $\epsilon/4$. The complete proof of Lemma 7
is deferred to Appendix B.2.

We can now complete the proof that partial symmetry is efficiently testable. 

Proof of Theorem 2. Note that $|W| \geq \frac{n}{2r}$ indeed holds with probability at least 8/9 by the Chernoff
bound. By Lemma 6, Find-Asymmetric-Set performs $O(\log \frac{k}{\epsilon})$ queries according to our choice
of $r$, and therefore the query complexity of Partially-Symmetric-Test is $O(\frac{k}{\epsilon} \log \frac{k}{\epsilon})$.

Suppose $f$ is an $(n-k)$-symmetric function. The probability that $W$ contains an asymmetric
variable is at most $k/r < 2/9$. Conditioned on this not occurring, every set returned by
Find-Asymmetric-Set contains an asymmetric variable. Since there are at most $k$ such variables, $J$
would be the union of at most $k$ sets and we would accept.

Suppose $f$ is a function $\epsilon$-far from being $(n-k)$-symmetric. From Lemma 7, with probability
at least 8/9, $\mathrm{SymInf}_f(\bar{J}) \geq \epsilon/9$ holds while $J$ consists of at most $k$ parts. Conditioned on that, by
executing Find-Asymmetric-Set $\Theta(k/\epsilon)$ times we obtain more than $k$ parts with probability at
least 8/9, according to Lemma 6. Thus, we reject with probability at least 2/3. □

5 Isomorphism testing of partially symmetric functions 

In this section we prove that isomorphism testing of partially symmetric functions can be done
with a constant number of queries. The algorithm we describe consists of two main components,
and follows a similar approach to the one used in [12] to show that juntas are isomorphism
testable. The first, which we already described in Section 4, is an efficient tester for the property
of being partially symmetric. Once we know the input function is indeed close to being partially
symmetric, we can verify it is isomorphic (or at least very close) to the correct one. The second
component of the algorithm is therefore an efficient sampler from the core of a function which is
(close to) partially symmetric. Comparing the cores of two partially symmetric functions suffices
to identify if two such functions are isomorphic or far from it.

Ideally, when sampling the core of a partially symmetric function $f$, we would like to sample
it according to the marginal distribution of sampling $f$ at a uniform input $x \in \{0,1\}^n$. We denote
this marginal distribution over $\{0,1\}^k \times \{0, 1, \ldots, n-k\}$ by $\mathcal{D}^*_{k,n}$, which is in fact uniform over
$\{0,1\}^k$ and binomial over $\{0, 1, \ldots, n-k\}$, independently.

In our scenario, sampling the core of a function according to this distribution is not possible
since we do not know the exact location of all the $k$ asymmetric variables. Instead, we use the
knowledge discovered by the partial symmetry tester, i.e., sets with asymmetric variables. Given
these sets, we are able to define a sampling distribution over $\{0,1\}^n$ such that we know the input
of the core for each query, and whose marginal distribution over the core is close enough to $\mathcal{D}^*_{k,n}$.

Definition 5. Let $\mathcal{I}$ be some partition of $[n]$ into an odd number of parts and let $W \in \mathcal{I}$ be the
workspace. Define the distribution $\mathcal{D}^W_{\mathcal{I}}$ over $\{0,1\}^n$ to be as follows. Pick a random Hamming
weight $w$ according to the binomial distribution over $\{0, \ldots, n\}$ and output, if it exists, a random
$x \in \{0,1\}^n$ of Hamming weight $|x| = w$ such that for every $I \in \mathcal{I} \setminus \{W\}$, either $x_I \equiv 0$ or $x_I \equiv 1$.
When no such $x$ exists, return the all zeros vector.
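One way to draw from the distribution of Definition 5 is sketched below. The sketch makes several assumptions that are not stated in the text: "a random $x$" is read as uniform over all valid vectors of weight $w$, the binomial has parameter 1/2, the partition is passed as a list of index lists, and the name `sample_from_D` is hypothetical. It does not enforce that the number of parts is odd. Uniformity is obtained by counting, for each total weight $s$ placed on the constant parts, how many choices of parts realize it.

```python
import random
from math import comb

def sample_from_D(parts, workspace, n, rng=random):
    """Draw x as in Definition 5; `parts` partitions range(n), `workspace` indexes into it."""
    w = sum(rng.randrange(2) for _ in range(n))            # binomial(n, 1/2) Hamming weight
    W = parts[workspace]
    blocks = [p for i, p in enumerate(parts) if i != workspace and p]  # non-empty constant parts
    m = len(blocks)
    # ways[j][s] = number of sub-collections of blocks[j:] with total size s
    ways = [[0] * (n + 1) for _ in range(m + 1)]
    ways[m][0] = 1
    for j in range(m - 1, -1, -1):
        size = len(blocks[j])
        for s in range(n + 1):
            ways[j][s] = ways[j + 1][s] + (ways[j + 1][s - size] if s >= size else 0)
    # Number of valid x whose all-ones parts have total size s.
    weights = [ways[0][s] * comb(len(W), w - s) if 0 <= w - s <= len(W) else 0
               for s in range(n + 1)]
    x = [0] * n
    if sum(weights) == 0:
        return tuple(x)                                    # no valid x: all-zeros vector
    s = rng.choices(range(n + 1), weights=weights)[0]
    for i in rng.sample(W, w - s):                         # remaining weight goes to the workspace
        x[i] = 1
    for j in range(m):                                     # uniform sub-collection of total size s
        size = len(blocks[j])
        take = ways[j + 1][s - size] if s >= size else 0
        skip = ways[j + 1][s]
        if rng.randrange(take + skip) < take:
            for i in blocks[j]:
                x[i] = 1
            s -= size
    return tuple(x)
```

Every output is constant on each part other than the workspace, so the tester knows which value each candidate asymmetric part received, while the workspace absorbs the remaining weight needed to reach $w$.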

The sampling distribution which we just defined, together with the random choice of the
partition and workspace, satisfies the following two important properties. The first is that it is close to
uniform over the inputs of the function. The second is that its marginal distribution over the core
of a partially symmetric function is close to $\mathcal{D}^*_{k,n}$. These properties are formally stated here as
Proposition 1, whose proof is rather technical and appears in Appendix C.1.

Proposition 1. Let $J = \{j_1, \ldots, j_k\} \subseteq [n]$ be a set of size $k$, and $r = \Omega(k^2)$ be odd. If $x \sim \mathcal{D}^W_{\mathcal{I}}$
for a random partition $\mathcal{I}$ of $[n]$ into $r$ parts and a random workspace $W \in \mathcal{I}$, then

• $x$ is $o(1/n)$-close to being uniform over $\{0,1\}^n$, and

• $(x_J, |x_{\bar{J}}|)$ is $c/k$-close to being distributed according to $\mathcal{D}^*_{k,n}$ for our choice of $0 < c < 1$.

We are now ready to describe the algorithm for isomorphism testing of $(n-k)$-symmetric
functions. Given an $(n-k)$-symmetric function $f$, the algorithm tests whether the input function
$g$ is isomorphic to $f$ or $\epsilon$-far from being so.

Algorithm 3 Partially-Symmetric-Isomorphism-Test($f$, $k$, $g$, $\epsilon$)
1: Perform Partially-Symmetric-Test($g$, $k$, $\epsilon/1000$) and reject if failed.
2: Let $\mathcal{I}$ and $W \in \mathcal{I}$ be the partition and workspace used by the algorithm.
3: Let $J$ be the union of the $k$ parts identified by the algorithm.
4: for each $i = 1$ to $\Theta(k \log k/\epsilon^2)$ do
5:     Query $g(x)$ at a random $x \sim \mathcal{D}^W_{\mathcal{I}}$.
6: Accept iff a $(1 - \epsilon/2)$-fraction of the queries are consistent with some isomorphism of $f$, which
   maps the asymmetric variables of $f$ into all $k$ parts of $J$.
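As a rough illustration of the acceptance rule in the last step, the sketch below brute-forces all $k!$ ways of mapping the asymmetric variables of $f$ into the $k$ identified parts. The function name, the sample format `(z, rest_weight, value)` and the callable `f_core` are hypothetical conventions chosen for this example; the paper does not prescribe an implementation.

```python
from itertools import permutations

def consistent_with_some_isomorphism(samples, f_core, k, eps):
    """Sketch of the final acceptance check (hypothetical interface).

    samples: list of (z, rest_weight, value) where z is the tuple of k bits
             assigned to the k parts of J, rest_weight is the Hamming weight
             of the remaining coordinates, and value is the answer g(x).
    f_core:  callable (z, rest_weight) -> bit, the core of f.
    """
    needed = (1 - eps / 2) * len(samples)
    for perm in permutations(range(k)):
        # perm[i] is the part of J that receives the i-th asymmetric variable of f.
        agree = sum(
            1 for z, rest_weight, value in samples
            if f_core(tuple(z[perm[i]] for i in range(k)), rest_weight) == value
        )
        if agree >= needed:
            return True
    return False
```

The loop over `permutations(range(k))` is what makes the number of candidate isomorphisms $k!$ rather than $n!$.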



We provide here a sketch of the analysis of the algorithm. See Appendix C.2 for the formal
analysis and complete proof of Theorem 1. The first case to analyze is when $g$ is rejected by
Partially-Symmetric-Test, which implies that with good probability it is not $(n-k)$-symmetric
and in particular not isomorphic to $f$. Assume now that Partially-Symmetric-Test did not
reject and therefore $g$ is likely to be $\epsilon/1000$-close to being $(n-k)$-symmetric. Let $\mathcal{I}$, $W$
and $J$ be the partition, workspace and union of $k$ parts identified by the algorithm. The main idea
of the proof is showing that with good probability, there exists a function $h$ that (a) is $\epsilon/250$-close
to $g$, and (b) is $(n-k)$-symmetric with asymmetric variables contained in $J$ and separated by $\mathcal{I}$.
We prove the existence of this function $h$ using the properties of symmetric influence established
earlier. Assuming such $h$ exists, we use Proposition 1 to show that our queries to $g$,
according to the sampling distribution, are in fact $\epsilon/10$-close to querying $h$'s core.

We now consider the following two cases. If $g$ is isomorphic to $f$, then for some isomorphism
$f_\pi$ of $f$, which maps the asymmetric variables of $f$ into the parts of $J$, it holds that $\mathrm{dist}(f_\pi, h) \le
\mathrm{dist}(f_\pi, g) + \mathrm{dist}(g, h) \le \epsilon/500 + \epsilon/250$. Notice that we cannot assume that $g = f_\pi$, as it is possible
that one of the asymmetric variables of $g$ is not in $J$ (but the distance must be small). If $g$ is
$\epsilon$-far from being isomorphic to $f$, then for every isomorphism $f_\pi$ of $f$,

$$\mathrm{dist}(f_\pi, h) \ge \mathrm{dist}(f_\pi, g) - \mathrm{dist}(g, h) \ge \epsilon - \epsilon/250 .$$

Given that there are only $k!$ isomorphisms of $f$ we need to consider, performing $O(k \log k/\epsilon^2)$ queries
suffices for returning the correct answer in both cases, with good probability.

As we outlined above, we in fact build an efficient sampler for the core of $(n-k)$-symmetric
functions (or functions close to being so). Given the parts identified by Partially-Symmetric-Test,
assuming it did not reject, we can sample the function's core by querying it at a single
location, where the distribution over the core's inputs is close to $\mathcal{D}^*_{k,n}$. The algorithm and proof of
Theorem 3 are deferred to Appendix C.3.

6 Discussion 

We showed that every partially symmetric function is isomorphism testable with a constant number 
of queries. It's easy to see that functions that are "close" to partially symmetric can also be 
isomorphism-tested with a constant number of queries. We believe that our result not only unifies 
the previous classes of functions efficiently isomorphism-testable, but that it includes essentially all 
of these functions. 

Conjecture 1. Let $f : \{0,1\}^n \to \{0,1\}$ be $\epsilon$-far from $(n-k)$-symmetric. Then testing
$f$-isomorphism requires at least $\Omega(\log \log k)$ queries.

In fact, we believe that more is true: perhaps even $\Omega(k)$ queries are required. But the weaker
bound (or, indeed, any function that grows with $k$) is sufficient to complete the qualitative
characterization of functions that are isomorphism-testable with a constant number of queries.

The known hardness results on isomorphism testing are all consistent with Conjecture 1. In
particular, by the result in [5], we know that testing $f$-isomorphism requires at least $\Omega(k)$ queries
for almost all functions $f$ that are $\epsilon$-far from $(n-k)$-symmetric. A simple extension of the proof
in [10] shows that for every $(n-k)$-symmetric function $f$ that is $\epsilon$-far from $(n-k+1)$-symmetric,
testing $f$-isomorphism requires $\Omega(\log \log k)$ queries (assuming $k/n$ is bounded away from 1).

Lastly, let us consider another natural definition of partial symmetry that encompasses both
symmetric functions and juntas. The function $f : \{0,1\}^n \to \{0,1\}$ is $k$-part symmetric if there is
a partition $\mathcal{I} = \{I_1, \ldots, I_k\}$ of $[n]$ such that $f$ is invariant under any permutation $\pi$ of $[n]$ where
$\pi(I_i) = I_i$ for every $i = 1, \ldots, k$. One may be tempted to guess that $k$-part symmetric functions are
efficiently isomorphism-testable. That is not the case, even when $k = 2$. To see this, consider the
function $f(x) = x_1 \oplus x_2 \oplus \cdots \oplus x_{n/2}$. This function is 2-part symmetric, but testing isomorphism
to $f$ requires $\Omega(n)$ queries [8].

A Properties of symmetric influence

A.1 Fourier representation of symmetric influence

For convenience, we consider functions whose ranges are {—1, 1} instead of {0, 1}. Then, the 
symmetric influence of a function can be expressed as follows. 

Proposition 2. Given a Boolean function $f : \{0,1\}^n \to \{-1,1\}$ and a set $J \subseteq [n]$, the symmetric
influence of $J$ with respect to $f$ can also be computed as

$$\mathrm{SymInf}_f(J) = \frac{1}{2} \sum_{S \subseteq [n]} \mathrm{Var}_{\pi \in \mathcal{S}_J}\big[\hat{f}(\pi S)\big],$$

where $\hat{f}(S)$ is the Fourier coefficient of $f$ for the set $S \subseteq [n]$, and $\pi S = \{\pi(i) \mid i \in S\}$.

The proposition indicates that the symmetric influence of any set $J$ can be computed as a
function of the variance of the Fourier coefficients of the function in the different layers. Each layer
here refers to all the Fourier coefficients of sets which share the intersection with $[n] \setminus J$ and the
intersection size with $J$, resulting in $(|J| + 1)2^{n-|J|}$ different layers.

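The key to proving this proposition is the following basic result on linear functions. Recall that
for a set $S \subseteq [n]$, the function $\chi_S : \{0,1\}^n \to \{-1,1\}$ is defined by $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$.

Lemma 8. Fix $J, S, T \subseteq [n]$. Then

$$\mathop{\mathbb{E}}_{x \in \{0,1\}^n,\ \pi \in \mathcal{S}_J}\big[\chi_S(x) \cdot \chi_T(\pi x)\big] =
\begin{cases} \binom{|J|}{|S \cap J|}^{-1} & \text{if } \exists \pi \in \mathcal{S}_J,\ \pi S = T \\ 0 & \text{otherwise.} \end{cases}$$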

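Proof. For any vector $x \in \{0,1\}^n$, any set $S \subseteq [n]$, and any permutation $\pi \in \mathcal{S}_n$, we have the
identity $\chi_S(\pi x) = \chi_{\pi^{-1}S}(x)$. So

$$\mathop{\mathbb{E}}_{x \in \{0,1\}^n,\ \pi \in \mathcal{S}_J}\big[\chi_S(x) \cdot \chi_T(\pi x)\big] = \mathop{\mathbb{E}}_{x,\pi}\big[\chi_S(x)\chi_{\pi^{-1}T}(x)\big] = \mathop{\mathbb{E}}_{\pi}\Big[\mathop{\mathbb{E}}_{x}\big[\chi_S(x)\chi_{\pi^{-1}T}(x)\big]\Big] .$$

But $\mathbb{E}_x[\chi_S(x)\chi_{\pi^{-1}T}(x)] = \mathbf{1}[S = \pi^{-1}T]$, so we also have

$$\mathop{\mathbb{E}}_{x,\pi}\big[\chi_S(x) \cdot \chi_T(\pi x)\big] = \mathop{\mathbb{E}}_{\pi}\big[\mathbf{1}[S = \pi^{-1}T]\big] = \Pr_{\pi}[\pi S = T] .$$

The identity $\pi S = T$ holds iff the permutation $\pi$ satisfies $\pi(i) \in T$ for every $i \in S$. Since we only
permute elements from $J$, the sets $S$ and $T$ must agree on the elements of $[n] \setminus J$. If this is not the
case, or if the intersections of the sets with $J$ are not of the same size, no such permutation exists.
Otherwise, this event occurs if the elements of $S \cap J$ are mapped to the exact locations of $T \cap J$.
This holds for one out of the $\binom{|J|}{|S \cap J|}$ possible sets of locations, each with equal probability. □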
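Lemma 8 is easy to sanity-check by exhaustive enumeration on a few coordinates. The sketch below is only an illustration with hypothetical helper names; it uses the convention $(\pi x)_{\pi(i)} = x_i$, which is consistent with the identity $\chi_S(\pi x) = \chi_{\pi^{-1}S}(x)$ used in the proof.

```python
from itertools import permutations, product
from fractions import Fraction
from math import comb

def chi(S, x):
    """Character chi_S(x) = (-1)^(sum of x_i for i in S)."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def perms_fixing_outside(J, n):
    """Yield every permutation of range(n) that moves only coordinates in J."""
    J = sorted(J)
    for image in permutations(J):
        pi = list(range(n))
        for src, dst in zip(J, image):
            pi[src] = dst
        yield pi

def permute(pi, x):
    """Return pi(x), using the convention (pi x)_{pi(i)} = x_i."""
    y = [0] * len(x)
    for i, bit in enumerate(x):
        y[pi[i]] = bit
    return tuple(y)

def correlation(S, T, J, n):
    """Exact value of E_{x, pi in S_J}[chi_S(x) * chi_T(pi x)]."""
    total = count = 0
    for pi in perms_fixing_outside(J, n):
        for x in product((0, 1), repeat=n):
            total += chi(S, x) * chi(T, permute(pi, x))
            count += 1
    return Fraction(total, count)

def predicted(S, T, J):
    """Right-hand side of Lemma 8."""
    S, T, J = set(S), set(T), set(J)
    if S - J == T - J and len(S & J) == len(T & J):
        return Fraction(1, comb(len(J), len(S & J)))
    return Fraction(0)
```

Comparing `correlation(S, T, J, n)` with `predicted(S, T, J)` over all pairs of subsets for, say, $n = 4$ and $J = \{0,1,2\}$ exercises both branches of the lemma.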

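Proof of Proposition 2. By appealing to the fact that $f$ is $\{-1,1\}$-valued, we have that

$$\Pr_{x,\pi}[f(x) \ne f(\pi x)] = \frac{1}{4}\, \mathop{\mathbb{E}}_{x,\pi}\big[f(x)^2 + f(\pi x)^2 - 2f(x)f(\pi x)\big] .$$

Applying linearity of expectation and Parseval's identity, we obtain

$$\mathop{\mathbb{E}}_{x,\pi}\big[f(x)^2 + f(\pi x)^2 - 2f(x)f(\pi x)\big] = 2\sum_{S \subseteq [n]} \hat{f}(S)^2 - 2\sum_{S,T \subseteq [n]} \hat{f}(S)\hat{f}(T) \mathop{\mathbb{E}}_{x,\pi}\big[\chi_S(x)\chi_T(\pi x)\big] .$$

Fix any $S \subseteq [n]$. By Lemma 8,

$$\sum_{T \subseteq [n]} \hat{f}(T) \mathop{\mathbb{E}}_{x,\pi}\big[\chi_S(x)\chi_T(\pi x)\big] = \sum_{T \,:\, \exists \pi \in \mathcal{S}_J,\ \pi S = T} \frac{\hat{f}(T)}{\binom{|J|}{|S \cap J|}} = \mathop{\mathbb{E}}_{\pi \in \mathcal{S}_J}\big[\hat{f}(\pi S)\big] .$$

Given this equality,

$$\sum_{S,T \subseteq [n]} \hat{f}(S)\hat{f}(T) \mathop{\mathbb{E}}_{x,\pi}\big[\chi_S(x)\chi_T(\pi x)\big] = \sum_{S \subseteq [n]} \hat{f}(S) \mathop{\mathbb{E}}_{\pi}\big[\hat{f}(\pi S)\big] .$$

By applying some elementary manipulation, we now get

$$\Pr_{x,\pi}[f(x) \ne f(\pi x)] = \frac{1}{2}\sum_{S \subseteq [n]} \hat{f}(S)\Big(\hat{f}(S) - \mathop{\mathbb{E}}_{\pi}\big[\hat{f}(\pi S)\big]\Big)
= \frac{1}{2}\sum_{S \subseteq [n]} \Big(\mathop{\mathbb{E}}_{\pi}\big[\hat{f}(\pi S)^2\big] - \mathop{\mathbb{E}}_{\pi}\big[\hat{f}(\pi S)\big]^2\Big) = \frac{1}{2}\sum_{S \subseteq [n]} \mathrm{Var}_{\pi}\big[\hat{f}(\pi S)\big] .$$

□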
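For small $n$, the Fourier formula of Proposition 2 can be compared with the combinatorial definition of symmetric influence, $\Pr_{x,\pi \in \mathcal{S}_J}[f(x) \ne f(\pi x)]$, by direct enumeration. The sketch below is illustrative only; all function names are hypothetical and $f$ is assumed to take values in $\{-1, 1\}$.

```python
from itertools import permutations, product
from fractions import Fraction

def perms_fixing_outside(J, n):
    """Yield every permutation of range(n) that moves only coordinates in J."""
    J = sorted(J)
    for image in permutations(J):
        pi = list(range(n))
        for src, dst in zip(J, image):
            pi[src] = dst
        yield pi

def permute(pi, x):
    """Return pi(x), using the convention (pi x)_{pi(i)} = x_i."""
    y = [0] * len(x)
    for i, bit in enumerate(x):
        y[pi[i]] = bit
    return tuple(y)

def fourier(f, n):
    """Fourier coefficients of f: {0,1}^n -> {-1,1}, keyed by frozenset S."""
    pts = list(product((0, 1), repeat=n))
    coeffs = {}
    for mask in pts:
        S = frozenset(i for i in range(n) if mask[i])
        total = sum(f(x) * (-1) ** sum(x[i] for i in S) for x in pts)
        coeffs[S] = Fraction(total, len(pts))
    return coeffs

def sym_inf_direct(f, J, n):
    """Pr over uniform x and pi in S_J that f(x) != f(pi x)."""
    perms = list(perms_fixing_outside(J, n))
    pts = list(product((0, 1), repeat=n))
    differ = sum(f(x) != f(permute(pi, x)) for pi in perms for x in pts)
    return Fraction(differ, len(perms) * len(pts))

def sym_inf_fourier(f, J, n):
    """(1/2) * sum over S of Var_pi[ f_hat(pi S) ], as in Proposition 2."""
    coeffs = fourier(f, n)
    perms = list(perms_fixing_outside(J, n))
    total = Fraction(0)
    for S in coeffs:
        vals = [coeffs[frozenset(pi[i] for i in S)] for pi in perms]
        mean = sum(vals) / len(vals)
        total += sum(v * v for v in vals) / len(vals) - mean * mean
    return total / 2
```

Exact rational arithmetic (`Fraction`) is used so that the two quantities can be compared with `==` rather than up to floating-point error.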



A.2 Monotonicity of symmetric influence

Lemma 4 (Restated). For any function $f : \{0,1\}^n \to \{0,1\}$ and any sets $J \subseteq K \subseteq [n]$,

$$\mathrm{SymInf}_f(J) \le \mathrm{SymInf}_f(K) .$$

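Proof. Fix a function $f$ and two sets $J, K \subseteq [n]$ so that $J \subseteq K$. We have seen before that the
symmetric influence can be computed in layers, where each layer is determined by the Hamming
weight and the elements outside the set we are considering. Using the fact that $\mathrm{Var}(X) = \Pr[X =
0] \cdot \Pr[X = 1]$, the symmetric influence is twice the expected variance over all the layers (considering
also the size of the layers). Using the same notation as before,

$$\mathrm{SymInf}_f(J) = \frac{1}{2^n} \sum_{z} \sum_{w} \big|L^{w}_{\overline{J} \leftarrow z}\big| \cdot 2\,\mathrm{Var}_x\big[f(x) \mid x \in L^{w}_{\overline{J} \leftarrow z}\big]
= 2 \cdot \mathop{\mathbb{E}}_{y}\Big[\mathrm{Var}_x\big[f(x) \mid x \in L^{|y|}_{\overline{J} \leftarrow y_{\overline{J}}}\big]\Big] .$$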
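The layer formula above translates directly into a short computation for small $n$. In the sketch below (hypothetical function name, $f$ assumed $\{0,1\}$-valued), a layer is keyed by the values of $x$ outside $J$ together with the Hamming weight of $x$, and each layer contributes its size times $2p(1-p)$, where $p$ is the fraction of ones of $f$ on the layer.

```python
from itertools import product
from fractions import Fraction
from collections import defaultdict

def sym_inf_layers(f, J, n):
    """Symmetric influence of J computed layer by layer (illustrative sketch)."""
    J = set(J)
    outside = [i for i in range(n) if i not in J]
    layers = defaultdict(list)
    for x in product((0, 1), repeat=n):
        # A layer fixes the coordinates outside J and the total Hamming weight.
        key = (tuple(x[i] for i in outside), sum(x))
        layers[key].append(f(x))
    total = Fraction(0)
    for values in layers.values():
        p = Fraction(sum(values), len(values))   # Pr[f = 1] on this layer
        total += len(values) * 2 * p * (1 - p)   # |layer| * 2 * Var
    return total / 2 ** n
```

Evaluating it along a chain such as $\emptyset \subseteq \{0\} \subseteq \{0,1\} \subseteq \{0,1,2\}$ gives a non-decreasing sequence, in line with the statement of the lemma.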

A key observation is that since $J \subseteq K$, the layers determined when considering $J$ are a refinement
of the layers determined when considering $K$. Together with the fact that $\mathrm{Var}(X) = \Pr[X =
0] \cdot \Pr[X = 1]$ is a concave function in the range $[0,1]$, we can apply Jensen's inequality on each layer
before and after the refinement to get the desired inequality. More precisely, for every $z \in \{0,1\}^{|\overline{K}|}$
and $0 \le w \le n$,

$$\mathrm{Var}_x\big[f(x) \mid x \in L^{w}_{\overline{K} \leftarrow z}\big] \ge \mathop{\mathbb{E}}_{y}\Big[\mathrm{Var}_x\big[f(x) \mid x \in L^{w}_{\overline{J} \leftarrow y_{\overline{J}}}\big] \;\Big|\; y \in L^{w}_{\overline{K} \leftarrow z}\Big] .$$

Averaging this over all layers, we get the desired result. □

A.3 Weak sub-additivity of symmetric influence

In this section we prove that symmetric influence satisfies weak sub-additivity. It might be tempting
to think that strong sub-additivity holds, as in the standard notion of influence; however, this is
not the case. For example, consider the function $f(x) = f_1(x_J) \oplus f_2(x_K)$ for some partition
$[n] = J \cup K$ and two randomly chosen symmetric functions $f_1, f_2$. Since $f$ is far from symmetric,
$\mathrm{SymInf}_f([n]) = \mathrm{SymInf}_f(J \cup K) > 0$ while $\mathrm{SymInf}_f(J) = \mathrm{SymInf}_f(K) = 0$.

The additive factor of $c\sqrt{\gamma}$ in Lemma 5 is derived from the distance between the two distributions
$\pi_{J \cup K}x$ and $\pi_J \pi_K x$, for a random $x \in \{0,1\}^n$ and random permutations from $\mathcal{S}_{J \cup K}, \mathcal{S}_J, \mathcal{S}_K$. When
the sets $J$ and $K$ are large, the distance between these distributions is relatively small, which
results in this weak sub-additivity property.

The analysis of the lemma is done using hypergeometric distributions, and the distance between
them. Let $\mathcal{H}_{n,m,k}$ be the hypergeometric distribution obtained when we pick $k$ balls out of $n$, $m$ of
which are red, and count the number of red balls we obtained. Let $d_{TV}(\cdot,\cdot)$ denote the statistical
distance between two distributions. The following two lemmas will be useful for our proof.

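Lemma 9. Let $J, K \subseteq [n]$ be two sets and $\pi, \pi_J, \pi_K$ be permutations chosen uniformly at random
from $\mathcal{S}_{J \cup K}, \mathcal{S}_J, \mathcal{S}_K$, respectively. For a fixed $x \in \{0,1\}^n$, we define $D_{\pi x}$ and $D_{\pi_J \pi_K x}$ as the
distribution of $\pi x$ and $\pi_J \pi_K x$, respectively. Then,

$$d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x}) = d_{TV}\big(\mathcal{H}_{|J \cup K|,\,|x_{J \cup K}|,\,|K \setminus J|},\ \mathcal{H}_{|K|,\,|x_K|,\,|K \setminus J|}\big)$$

holds.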
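For a handful of coordinates, both sides of Lemma 9 can be computed exactly. The sketch below (hypothetical helper names, exhaustive enumeration, exact rational arithmetic) builds the two output distributions by listing all permutations and compares their total variation distance with that of the two hypergeometric laws. Here $\pi_J \pi_K x$ is read as applying $\pi_K$ first and then $\pi_J$.

```python
from itertools import permutations
from collections import Counter
from fractions import Fraction
from math import comb

def perms_fixing_outside(J, n):
    """Yield every permutation of range(n) that moves only coordinates in J."""
    J = sorted(J)
    for image in permutations(J):
        pi = list(range(n))
        for src, dst in zip(J, image):
            pi[src] = dst
        yield pi

def permute(pi, x):
    """Return pi(x), using the convention (pi x)_{pi(i)} = x_i."""
    y = [0] * len(x)
    for i, bit in enumerate(x):
        y[pi[i]] = bit
    return tuple(y)

def normalized(counter):
    total = sum(counter.values())
    return {key: Fraction(c, total) for key, c in counter.items()}

def tv(p, q):
    """Total variation distance between two finite distributions (dicts)."""
    return sum(abs(p.get(key, 0) - q.get(key, 0)) for key in set(p) | set(q)) / 2

def hypergeom(n, m, k):
    """Law of the number of red balls when k of n balls are drawn, m being red."""
    return {z: Fraction(comb(m, z) * comb(n - m, k - z), comb(n, k))
            for z in range(max(0, k - (n - m)), min(m, k) + 1)}

def lemma9_sides(x, J, K):
    """Return (left-hand side, right-hand side) of Lemma 9 for the input x."""
    n = len(x)
    U = sorted(set(J) | set(K))
    d_pi = Counter(permute(pi, x) for pi in perms_fixing_outside(U, n))
    d_jk = Counter(permute(pj, permute(pk, x))
                   for pj in perms_fixing_outside(J, n)
                   for pk in perms_fixing_outside(K, n))
    only_k = len(set(K) - set(J))
    rhs = tv(hypergeom(len(U), sum(x[i] for i in U), only_k),
             hypergeom(len(K), sum(x[i] for i in K), only_k))
    return tv(normalized(d_pi), normalized(d_jk)), rhs
```

For instance, with $x = 11000$, $J = \{0,1,2\}$ and $K = \{2,3,4\}$, both sides evaluate to $7/10$.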

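Lemma 10. Let $n, m, n', m', k$ be non-negative integers with $k, n' \le \gamma n$ for some $\gamma \le \frac{1}{2}$. Suppose
that $|m - \frac{n}{2}| \le t\sqrt{n}$ and $|m' - \frac{n'}{2}| \le t\sqrt{n'}$ hold for some $t \le \frac{1}{100\sqrt{\gamma}}$. Then,

$$d_{TV}(\mathcal{H}_{n,m,k}, \mathcal{H}_{n-n',m-m',k}) \le c_{10}(1 + t)\gamma$$

holds for some universal constant $c_{10}$.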

We first show how these lemmas imply Lemma 5, and afterwards prove them.

Lemma 5 (Restated). There is a universal constant $c$ such that, for any constant $0 < \gamma < 1$, a
function $f : \{0,1\}^n \to \{0,1\}$ and sets $J, K \subseteq [n]$ of size at least $(1 - \gamma)n$,

$$\mathrm{SymInf}_f(J \cup K) \le \mathrm{SymInf}_f(J) + \mathrm{SymInf}_f(K) + c\sqrt{\gamma} .$$

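Proof. Let $\pi, \pi_J$ and $\pi_K$ be as in Lemma 9 and fix $x \in \{0,1\}^n$ to be some input. Then

$$\Pr_{\pi}[f(x) \ne f(\pi x)] \le \Pr_{\pi_J,\pi_K}[f(x) \ne f(\pi_J \pi_K x)] + d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x})$$
$$\le \Pr_{\pi_K}[f(x) \ne f(\pi_K x)] + \Pr_{\pi_J,\pi_K}[f(\pi_K x) \ne f(\pi_J \pi_K x)] + d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x}) .$$

By summing over all possible inputs $x$ we have

$$\mathrm{SymInf}_f(J \cup K) = \Pr_{x,\pi}[f(x) \ne f(\pi x)] = \frac{1}{2^n}\sum_{x} \Pr_{\pi}[f(x) \ne f(\pi x)]
\le \mathrm{SymInf}_f(J) + \mathrm{SymInf}_f(K) + \frac{1}{2^n}\sum_{x} d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x}) .$$

By applying Lemma 9 over each input $x$, it suffices to show that

$$\frac{1}{2^n}\sum_{x} d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x}) = \frac{1}{2^n}\sum_{x} d_{TV}\big(\mathcal{H}_{|J \cup K|,\,|x_{J \cup K}|,\,|K \setminus J|},\ \mathcal{H}_{|K|,\,|x_K|,\,|K \setminus J|}\big) \le c\sqrt{\gamma} . \qquad (1)$$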

Ideally, we would like to apply Lemma 10 on every input $x$ and get the desired result; however,
this is not possible as some inputs do not satisfy the requirements of the lemma. Therefore, we
perform a slightly more careful analysis. Let us choose $c \ge 2$ and assume $\gamma \le \frac{1}{4}$ (as otherwise
the claim trivially holds). Fix $\gamma' = \gamma/(1 - \gamma) \le \frac{1}{2}$ and $t = \frac{1}{100\sqrt{\gamma'}}$. We first note that regardless
of $x$, the required conditions on the size of the sets hold. To be exact, $|J \setminus K| \le \gamma'|J \cup K|$ and
$|K \setminus J| \le \gamma'|J \cup K|$ since $|J \cup K| \ge (1 - \gamma)n$ and $|J \setminus K| \le |\overline{K}| \le \gamma n$ (and similarly $|K \setminus J| \le \gamma n$).

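We say an input $x$ is good if it satisfies the other conditions of Lemma 10. That is, both

$$\Big|\,|x_{J \cup K}| - \frac{|J \cup K|}{2}\Big| \le t\sqrt{|J \cup K|} \quad\text{and}\quad \Big|\,|x_{J \setminus K}| - \frac{|J \setminus K|}{2}\Big| \le t\sqrt{|J \setminus K|}$$

hold. Otherwise we call such $x$ bad. From the Chernoff bound and the union bound, the probability
that $x$ is bad is at most $4\exp(-2t^2) \le 4\exp\big(-\frac{1}{5000\gamma'}\big) \le c'\gamma$ for some constant $c'$ (notice that $\gamma' \le 2\gamma$).
By applying Lemma 10 over the good inputs we get

$$\frac{1}{2^n}\sum_{x} d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x}) = \frac{1}{2^n}\sum_{x:\,\mathrm{bad}} d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x}) + \frac{1}{2^n}\sum_{x:\,\mathrm{good}} d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x}) \le c'\gamma + c_{10}(1 + t)\gamma' \le c\sqrt{\gamma}$$

for some constant $c$, as required. □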

Proof of Lemma 9. Since both distributions $D_{\pi x}$ and $D_{\pi_J \pi_K x}$ only modify coordinates in $J \cup K$,
we can ignore all other coordinates. Moreover, it in fact suffices to look only at the number of
ones in the coordinates of $K \setminus J$ and $J \cup K$, which completely determines the distributions. Let
$D_z$ denote the uniform distribution over all elements $y \in \{0,1\}^n$ such that $|y| = |x|$, $y_{\overline{J \cup K}} = x_{\overline{J \cup K}}$
and $|y_{K \setminus J}| = z$ (which also fixes the number of ones in $y_J$). Notice that this is well defined only
for values of $z$ such that $\max\{0, |x_{J \cup K}| - |J|\} \le z \le \min\{|x_{J \cup K}|, |K \setminus J|\}$.

Given this notation, $D_{\pi x}$ can be looked at as choosing $z \sim \mathcal{H}_{|J \cup K|,\,|x_{J \cup K}|,\,|K \setminus J|}$ and returning
$y \sim D_z$. This is because we apply a random permutation over all elements of $J \cup K$, and therefore
the number of ones inside $K \setminus J$ is indeed distributed like $z$. Moreover, the order inside both sets
$K \setminus J$ and $J$ is uniform.

The distribution $D_{\pi_J \pi_K x}$ can be looked at as choosing $z \sim \mathcal{H}_{|K|,\,|x_K|,\,|K \setminus J|}$ and returning $y \sim D_z$.
The number of ones in $K \setminus J$ is determined already after applying $\pi_K$. It is distributed like $z$ as
we care about the choice of $|K \setminus J|$ out of the $|K|$ elements, and $|x_K|$ of them are ones (and their
order is uniform). Later, we apply a random permutation $\pi_J$ over all other relevant coordinates,
so the order of elements in $J$ is also uniform.

Since the distributions $D_z$ are disjoint for different values of $z$, this implies that the distance
between the two distributions $D_{\pi x}$ and $D_{\pi_J \pi_K x}$ depends only on the number of ones chosen to be
inside $K \setminus J$. Therefore we have

$$d_{TV}(D_{\pi x}, D_{\pi_J \pi_K x}) = d_{TV}\big(\mathcal{H}_{|J \cup K|,\,|x_{J \cup K}|,\,|K \setminus J|},\ \mathcal{H}_{|K|,\,|x_K|,\,|K \setminus J|}\big)$$

as required. □

Proof of Lemma 10. Our proof uses the connection between the hypergeometric distribution and the
binomial distribution, which we denote by $\mathcal{B}_{n,p}$ (for $n$ experiments, each with success probability
$p$). By the triangle inequality we know that

$$d_{TV}(\mathcal{H}_{n,m,k}, \mathcal{H}_{n-n',m-m',k}) \le d_{TV}(\mathcal{H}_{n,m,k}, \mathcal{B}_{k,p}) + d_{TV}(\mathcal{B}_{k,p}, \mathcal{B}_{k,p'}) + d_{TV}(\mathcal{B}_{k,p'}, \mathcal{H}_{n-n',m-m',k}) \qquad (2)$$

where $p = \frac{m}{n}$ and $p' = \frac{m-m'}{n-n'}$. In order to bound the distances we just introduced, we use the
following two lemmas.

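Lemma 11 (Example 1 in [29]). $d_{TV}(\mathcal{H}_{n,m,k}, \mathcal{B}_{k,p}) \le \frac{k}{n}$ holds for $p = \frac{m}{n}$.

Lemma 12 ([1]). Let $0 < p < 1$ and $0 < \delta < 1 - p$. Then,

$$d_{TV}(\mathcal{B}_{n,p}, \mathcal{B}_{n,p+\delta}) \le \frac{\sqrt{e}}{2} \cdot \frac{\tau_{n,p}(\delta)}{(1 - \tau_{n,p}(\delta))^2}$$

provided $\tau_{n,p}(\delta) = \delta\sqrt{\frac{n+2}{2p(1-p)}} < 1$.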

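Before using the above lemmas, we analyze some of the parameters. First, when $k = 0$ the lemma
trivially holds and we therefore assume $k \ge 1$. Notice that this implies that $n\gamma \ge k \ge 1$. The
probability $p$ is known to be relatively close to half. To be exact,

$$\Big|p - \frac{1}{2}\Big| \le \frac{t\sqrt{n}}{n} \le \frac{1}{100\sqrt{n\gamma}} \le \frac{1}{100}$$

and therefore $\frac{1}{p(1-p)} \le 6$. Assume $p \le p'$ and let $\delta = p' - p$ (the other case can be treated in the
same manner). We first bound $\delta$ as follows.

$$\delta = \frac{mn' - nm'}{n(n - n')} = \frac{(m - \frac{n}{2})n' - n(m' - \frac{n'}{2})}{n(n - n')}
\le \frac{t(n\sqrt{n'} + \sqrt{n}\,n')}{n(n - n')} \le \frac{2t\sqrt{\gamma}\,n^{3/2}}{(1 - \gamma)n^2} \le 4t\sqrt{\frac{\gamma}{n}} \qquad \Big(\text{from } \gamma \le \frac{1}{2}\Big)$$

Then, $\tau_{k,p}(\delta)$ in Lemma 12 can be bounded by

$$\tau_{k,p}(\delta) = \delta\sqrt{\frac{k+2}{2p(1-p)}} \le 12t\sqrt{\gamma k/n} \le 12t\gamma \qquad (\text{from } 1 \le k \le \gamma n) .$$

Note that, from the assumption, we have $\tau_{k,p}(\delta) \le \frac{1}{2}$. By Lemmas 11 and 12, we have

$$d_{TV}(\mathcal{H}_{n,m,k}, \mathcal{H}_{n-n',m-m',k}) \le \frac{k}{n} + \frac{\sqrt{e}}{2} \cdot \frac{\tau_{k,p}(\delta)}{(1 - \tau_{k,p}(\delta))^2} + \frac{k}{n - n'}
\le 3\gamma + 2\sqrt{e} \cdot 12t\gamma \qquad \Big(\text{from } \tau_{k,p}(\delta) \le \frac{1}{2}\Big)$$
$$\le c_{10}(1 + t)\gamma$$

for some universal constant $c_{10}$. □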

B Testing partial symmetry 
B.1 Analysis of Find-Asymmetric-Set

In this section we prove there exists an algorithm Find-Asymmetric-Set, which satisfies Lemma 6.
Suppose that we have two inputs $x, y \in \{0,1\}^n$ with $x_J = y_J$, $|x| = |y|$ such that $f(x) \ne f(y)$.
Given such inputs, we know there exists some asymmetric variable outside of $J$. In order to
efficiently find a set from a partition $\mathcal{I}$ which contains such a variable, we will use binary search
over the sets. First, we construct a refinement $\mathcal{J}$ of $\mathcal{I}$. Every set of $\mathcal{I} \setminus \{W\}$ is partitioned further
into parts so that each part has size at most $\lceil |W|/4 \rceil$. Let $t = |\mathcal{J} \setminus \{W\}|$ be the number of parts
in $\mathcal{J}$ excluding the workspace. Notice that the number of parts is at most $t \le r + 4n/|W| = O(r)$.
Then, we construct a series of inputs $x^0 = x, x^1, \ldots, x^t = y$ by each step permuting only elements
from some set $I \in \mathcal{J} \setminus \{W\}$ and the workspace $W$ (that is, applying a permutation from $\mathcal{S}_{I \cup W}$).
In each such step, we guarantee that $x^i_I = y_I$ for one more set $I \in \mathcal{J} \setminus \{W\}$, and therefore after (at
most) $t$ steps we would reach $y$ (notice that we can choose the last step such that $x^t_W = y_W$ as the
Hamming weight of all the inputs in the sequence is identical).

Using this construction, we can now describe the algorithm Find-Asymmetric-Set as follows. 



Algorithm 4 Find-Asymmetric-Set($f$, $\mathcal{I}$, $J$, $W$)

Generate $x \in \{0,1\}^n$ and $\pi \in \mathcal{S}_{\overline{J}}$ uniformly at random, and let $y = \pi x$.
if $f(x) \ne f(\pi x)$ then
    Define $x^0, \ldots, x^t$.
    Perform binary search on $x = x^0, \ldots, x^t = y$, and find $i$ such that $f(x^{i-1}) \ne f(x^i)$.
    return the only part $I \in \mathcal{I} \setminus \{W\}$ such that $x^{i-1}_I \ne x^i_I$.
return $\emptyset$.
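The binary-search step of the algorithm can be sketched as follows. This is a minimal illustration, assuming the sequence $x^0, \ldots, x^t$ has already been built and is passed in as a list; the function name `find_flip` is hypothetical. Because $f$ is Boolean and differs on the two endpoints, the search maintains two indices whose values differ and halves the gap at each step.

```python
def find_flip(f, seq):
    """Return i with f(seq[i-1]) != f(seq[i]), given f(seq[0]) != f(seq[-1]).

    Uses 2 + ceil(log2(t)) evaluations of f, where t = len(seq) - 1.
    """
    lo, hi = 0, len(seq) - 1
    f_lo = f(seq[lo])
    assert f_lo != f(seq[hi]), "endpoints must have different values"
    # Invariant: f(seq[lo]) == f_lo and f(seq[hi]) != f_lo.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if f(seq[mid]) == f_lo:
            lo = mid
        else:
            hi = mid
    return hi
```

Note that the search does not require $f$ to change value only once along the sequence; it returns some index where a change occurs.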



Proof of Lemma 6. Since we perform binary search over the sequence $x^0, \ldots, x^t$, the query
complexity of the algorithm is indeed $O(\log t) = O(\log r)$. Also, it is easy to verify that we only output
an empty set or a part in $\mathcal{I} \setminus \{W\}$ disjoint from $J$ (as $x_J = y_J$).

Two random inputs $x$ and $y := \pi x$, for $\pi \in \mathcal{S}_{\overline{J}}$, satisfy $f(x) \ne f(y)$ with probability $\mathrm{SymInf}_f(\overline{J})$.
Thus, it suffices to show that we can always define a sequence $x^0, \ldots, x^t$, given the assumed lower bound on $|W|$.
In order to see this is always feasible, we consider the sequence after already defining $x^0, \ldots, x^i$,
showing we can define $x^{i+1}$.

Let $\mathcal{J}^+ = \{I \in \mathcal{J} \setminus \{W\} \mid |x^i_I| > |y_I|\}$ and $\mathcal{J}^- = \{I \in \mathcal{J} \setminus \{W\} \mid |x^i_I| < |y_I|\}$ denote the sets which require
increasing or decreasing the Hamming weight of $x_W$ respectively, when applying a permutation from
$\mathcal{S}_{I \cup W}$ to ensure $x^{i+1}_I = y_I$. Notice that we ignore sets $I$ for which $|x^i_I| = |y_I|$, as they do not impact
the Hamming weight of $x^i_W$. If $|\mathcal{J}^+| > 0$ and $|\mathcal{J}^-| > 0$, then since $\max(|x^i_W|, |W| - |x^i_W|) \ge \lceil |W|/2 \rceil$
and the size of every set $I \in \mathcal{J} \setminus \{W\}$ is at most $\lceil |W|/4 \rceil$, there must exist a set we can use to
define $x^{i+1}$. On the other hand, if $|\mathcal{J}^-| = 0$ for example, then we can define $x^{i+1}$ using any set
from $\mathcal{J}^+$ as $|x^i_W| - |y_W| = -\sum_{I \in \mathcal{J} \setminus \{W\}} \big(|x^i_I| - |y_I|\big)$ (recall that $|x| = |x^i| = |y|$).

It remains to show that when $W$ contains no asymmetric variables and we output a part
$I \in \mathcal{I} \setminus \{W\}$, $I$ contains an asymmetric variable. Suppose that the output $I$ is the part which was
modified between $x^{i-1}$ and $x^i$. Then, since $f(x^{i-1}) \ne f(x^i)$, $|x^{i-1}| = |x^i|$, and $x^{i-1}$ and $x^i$ differ
only on $I \cup W$, an asymmetric variable exists in $I \cup W$ and we know it is not in $W$. □

B.2 Proof of Lemma 7

We first note that when the number of parts $r$ is bigger than $n$, we simply partition into the $n$
single-element sets and the lemma trivially holds. For $0 < t < 1$, let $\mathcal{F}_t = \{J \subseteq [n] : \mathrm{SymInf}_f(\overline{J}) <
t\epsilon,\ |J| \le 5kn/r\}$ be the family of all sets which are not too big and whose complement has symmetric
influence of at most $t\epsilon$. (Notice that with high probability, the union of any $k$ sets in the partition
would have size smaller than $5kn/r$, and therefore we assume this is the case from this point on.)
Our first observation is that for small enough values of $t$, $\mathcal{F}_t$ is a $(k+1)$-intersecting family. Indeed,
for any sets $J, K \in \mathcal{F}_{1/3}$,

$$\mathrm{SymInf}_f(\overline{J \cap K}) = \mathrm{SymInf}_f(\overline{J} \cup \overline{K}) \le \mathrm{SymInf}_f(\overline{J}) + \mathrm{SymInf}_f(\overline{K}) + c\sqrt{5k/r} \le 2\epsilon/3 + \epsilon/9 < \epsilon .$$

Since $f$ is $\epsilon$-far from $(n-k)$-symmetric, every set $S \subseteq [n]$ of size $|S| \le k$ satisfies $\mathrm{SymInf}_f(\overline{S}) \ge \epsilon$.
So $|J \cap K| > k$.

We consider two cases separately: when $\mathcal{F}_{1/3}$ contains a set of size less than $2k$; and when it
does not. The first case is identical to the proof of Lemma 2 and hence we do not elaborate on it.

In the second case, which also resembles the proof of Lemma 2, we claim that $\mathcal{F}_{1/9}$ is a
$2k$-intersecting family. If this was not the case, we could find sets $J, K \in \mathcal{F}_{1/9}$ such that $|J \cap K| < 2k$
and $\mathrm{SymInf}_f(\overline{J \cap K}) \le \mathrm{SymInf}_f(\overline{J}) + \mathrm{SymInf}_f(\overline{K}) + \epsilon/9 \le \epsilon/3$, contradicting our assumption.

Let $J \subseteq [n]$ be the union of $k$ parts in $\mathcal{I}$. Since $\mathcal{I}$ is a random partition, $J$ is a random subset
obtained by including each element of $[n]$ in $J$ independently with probability $p = k/r < \frac{1}{2k+1}$. To
bound the probability that $J$ contains some element from $\mathcal{F}_{1/9}$, we define $\mathcal{F}'_{1/9}$ to be all the sets
that contain a member from $\mathcal{F}_{1/9}$. Since $\mathcal{F}'_{1/9}$ is also a $2k$-intersecting family, by the theorem on
intersecting families stated earlier, for every such $J$ of size at most $5kn/r$,

$$\Pr[\mathrm{SymInf}_f(\overline{J}) < \epsilon/9] = \Pr[J \in \mathcal{F}'_{1/9}] \le \mu_{k/r}(\mathcal{F}'_{1/9}) \le (k/r)^{2k} .$$

Applying the union bound over all possible choices for $k$ parts, $f$ will not satisfy the condition of
the lemma with probability at most $\binom{r}{k}\big(\frac{k}{r}\big)^{2k} = O(k^{-k})$, which completes the proof of the lemma.
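The final estimate can be spelled out as follows. This is only a sketch of the arithmetic, using the standard bound $\binom{r}{k} \le (er/k)^k$ and assuming $r \ge c_0 k^2$ for a constant $c_0 \ge e$; the name $c_0$ is introduced here for illustration.

```latex
\binom{r}{k}\left(\frac{k}{r}\right)^{2k}
  \le \left(\frac{e r}{k}\right)^{k}\left(\frac{k}{r}\right)^{2k}
  = \left(\frac{e k}{r}\right)^{k}
  \le \left(\frac{e}{c_0 k}\right)^{k}
  \le k^{-k} .
```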

C Isomorphism testing and sampling partially symmetric functions

C.1 Properties of the sampling distribution

We start this section with the following observation. When the number of parts $r$ reaches $n$ (or
alternatively when $k = \Omega(\sqrt{n})$), we consider the partition of $[n]$ into the $n$ single-element sets.
Notice that when this is the partition, $\mathcal{D}^W_{\mathcal{I}}$ is uniform over $\{0,1\}^n$, so its marginal over the core is
identical to $\mathcal{D}^*_{k,n}$, making the following proposition trivial. Therefore, in the proof we assume that
$r < n$ and $k = O(\sqrt{n})$.

Proof of Proposition 1. We start with the first part of the proposition, showing $x$ is almost uniform.
Consider the following procedure to generate a random $\mathcal{I}$, $W$ and $x$. We draw a random Hamming
weight $w \sim \mathcal{B}_{n,1/2}$ and define $x'$ to be the input consisting of $w$ ones followed by $n - w$ zeros. We
choose a random partition $\mathcal{I}'$ of $[n]$ into $r$ consecutive parts $I_1, \ldots, I_r$ (i.e., $I_1 = \{1, 2, \ldots, |I_1|\}$ and
$I_r = \{n - |I_r| + 1, \ldots, n\}$) according to the typical distribution of sizes in a random partition. Let
the workspace $W'$ be the only part which contains the coordinate $w$ (or $I_1$ if $w = 0$). We now apply
a random permutation over $x'$, $\mathcal{I}'$ and $W'$ to get $x$, $\mathcal{I}$ and $W$.

It is clear that the above procedure outputs a uniform $x$, as we applied a random permutation over
$x'$, which had a binomial Hamming weight. The choice of $\mathcal{I}$ was also done at random, considering
the applied permutation over $\mathcal{I}'$. The only difference is then in the choice of the workspace $W$,
which can only be reflected in its size. However, when $r = o(\sqrt{n})$ we will choose the middle part as
the workspace with probability $1 - o(1)$, regardless of its size. In the remaining cases, since there
are $r = \Omega(\sqrt{n})$ parts, the possible parts to be chosen as workspace are a small fraction among
all parts, and therefore $W$ would be $o(1)$-close to being a random part.

To prove the second property of the proposition, we also consider two cases. When $r = o(\sqrt{n})$,
with probability $1 - o(1)$, the workspace would have size $\omega(\sqrt{n})$ and also $w = n/2 + O(\sqrt{n})$. In
such a case, the $r - 1$ parts (excluding the workspace) would be half zeros and half ones, and the
marginal distribution over the number of ones in $J$ would be $\mathcal{H}_{r-1,(r-1)/2,k}$ (assuming the elements
of $J$ are separated by $\mathcal{I}$, which happens with probability $1 - o(1)$). By Lemma 11 the distance
between this distribution and $\mathcal{B}_{k,1/2}$ is bounded by $k/r \le c/k$ for our choice of $0 < c < 1$. Since
there is no restriction on the ordering of the sets, this is also the distance from uniform over $\{0,1\}^k$
as required.

In the remaining case where $r = \Omega(\sqrt{n})$, we can use the same arguments and also apply
Lemma 12 with the distributions $\mathcal{B}_{k,1/2}$ and $\mathcal{B}_{k,1/2+\delta}$ for $\delta = O(1/\sqrt{n})$, implying the distance
between these two distributions is at most $o(1)$. Combining this with the distance to $\mathcal{H}_{r-1,(r-1)(1/2+\delta),k}$
we get again a total distance of $k/r + o(1) \le c/k$ for our choice of $0 < c < 1$. □

C.2 Analysis of Partially-Symmetric-Isomorphism-Test 

The analysis of the algorithm is based on the fact that functions which pass the
Partially-Symmetric-Test satisfy some conditions, and in particular are close to being partially
symmetric. We therefore start with the following lemma.

Lemma 13. Let $g$ be a function $\epsilon$-close to being $(n-k)$-symmetric which passed the
Partially-Symmetric-Test($g$, $k$, $\epsilon$). In addition, let $\mathcal{I}$, $W$ and $J$ be the partition, workspace and identified
parts used by the algorithm. With probability at least 9/10, there exists a function $h$ which satisfies
the following properties.

• $h$ is $4\epsilon$-close to $g$, and

• $h$ is $(n-k)$-symmetric whose asymmetric variables are contained in $J$ and separated by $\mathcal{I}$.

Proof. Let $g^*$ be the $(n-k)$-symmetric function closest to $g$ (which can be $f$ itself, up to some
isomorphism) and $R$ be the set of (at most) $k$ asymmetric variables of $g^*$. By Lemma 3 and our
assumption over $g$,

$$\mathrm{SymInf}_g(\overline{R}) \le 2 \cdot \mathrm{dist}(g, g^*) \le 2\epsilon .$$

Notice however that $R$ is not necessarily contained in $J$ and therefore $g^*$ is not a good enough
candidate for $h$. Let $U = R \cap J$ be the intersection of the asymmetric variables of $g^*$ and the
sets identified by the algorithm. In order to show that $g$ is also close to being $\overline{U}$-symmetric,
we bound $\mathrm{SymInf}_g(\overline{U})$ using Lemma 5 with the sets $\overline{R}$ and $\overline{J}$. Notice that since $|R| \le k$ and
$|J| \le 2kn/r \le \epsilon^2 n/c^2$ for our choice of $c'$, we can bound the error term (in the notation of Lemma 5)
by $c\sqrt{\gamma} \le c\sqrt{\epsilon^2/c^2} \le \epsilon$. We therefore have

$$\mathrm{SymInf}_g(\overline{U}) \le \mathrm{SymInf}_g(\overline{R}) + \mathrm{SymInf}_g(\overline{J}) + \epsilon \le 2\epsilon + \epsilon + \epsilon = 4\epsilon$$

where we know $\mathrm{SymInf}_g(\overline{J}) \le \epsilon$ with probability at least 19/20 as the algorithm did not reject.

By applying Lemma 3 again, we know there exists a $\overline{U}$-symmetric function $h$, whose distance
to $g$ is bounded by $\mathrm{dist}(g, h) \le 4\epsilon$. Moreover, with probability at least 19/20, all its asymmetric
variables are completely separated by the partition $\mathcal{I}$ (and they were all identified as part of $J$). □

Given Lemma 13, we are now ready to analyze Partially-Symmetric-Isomorphism-Test.

Proof of Theorem 1. Before analyzing the algorithm we just described, we consider the case where
$k > n/10$. Since Theorem 2 does not hold for such $k$'s, we apply the basic algorithm of $O(n \log n/\epsilon)$
random queries, which is applicable for testing isomorphism of any given function (since there are
$n!$ possible isomorphisms, the random queries will rule out all of them with good probability,
assuming we should reject). Since $k = \Omega(n)$, the complexity of this algorithm fits the statement of
our theorem.

We start by analyzing the query complexity of the algorithm. The step of Partially-Symmetric-Test
performs $O(\frac{k}{\epsilon} \log \frac{k}{\epsilon})$ queries, and therefore the majority of the queries are performed at the
sampling stage, resulting in $O(k \log k/\epsilon^2)$ queries as required. In order to prove the correctness of
the algorithm, we consider the following cases.

• $g$ is $\epsilon$-far from being isomorphic to $f$ and $\epsilon/1000$-far from being $(n-k)$-symmetric.

• $g$ is $\epsilon$-far from being isomorphic to $f$ but $\epsilon/1000$-close to being $(n-k)$-symmetric.

• $g$ is isomorphic to $f$.

In the first case, with probability at least 9/10, Partially-Symmetric-Test will reject and so
will we, as required. We assume from this point on that Partially-Symmetric-Test did not
reject, as it will only reject $g$ which is isomorphic to $f$ with probability at most 1/10, and that we
are not in the first case. Notice that these cases match the conditions of Lemma 13, and therefore
from this point onward we assume there exists an $h$ satisfying the lemma's properties (remembering
we applied the algorithm with $\epsilon/1000$).

In order to bound the distance between $h$ and $g$ in our samples, we use Proposition 1, indicating

$$\Pr_{\mathcal{I},\,W,\,x \sim \mathcal{D}^W_{\mathcal{I}}}[g(x) \ne h(x)] = \mathrm{dist}(g, h) + o(1/n) .$$

By Markov's inequality, with probability at least 9/10, the partition $\mathcal{I}$ and the workspace $W$ satisfy

$$\Pr_{x \sim \mathcal{D}^W_{\mathcal{I}}}[g(x) \ne h(x)] \le 10 \cdot \mathrm{dist}(g, h) + o(1/n) \le 10 \cdot 4\epsilon/1000 + o(1/n) < \epsilon/20 .$$

By Proposition 1, if we were to sample $h$ according to $\mathcal{D}^W_{\mathcal{I}}$, it should be $\epsilon/20$-close to sampling
its core (assuming the partition size is large enough). Combined with the distance between $g$ and
$h$ in our samples, we expect our samples to be $\epsilon/20 + \epsilon/20 = \epsilon/10$ close to sampling $h$'s core.

The last part of the proof is showing that there would be an almost consistent isomorphism of
$f$ only when $g$ is isomorphic to $f$. Notice however that we care only for isomorphisms which map
the asymmetric variables of $f$ to the $k$ sets of $J$. Therefore, the number of different isomorphisms
we need to consider is $k!$.

Assume we are in the second case and $g$ is $\epsilon$-far from being isomorphic to $f$. Let $f_\pi$ be some
isomorphism of $f$. By our assumptions and Lemma 13,

$$\mathrm{dist}(f_\pi, h) \ge \mathrm{dist}(f_\pi, g) - \mathrm{dist}(g, h) \ge \epsilon - \epsilon/250 .$$

Each sample we perform would be inconsistent with $f_\pi$ with probability at least $\epsilon - \epsilon/250 - \epsilon/10 >
8\epsilon/9$. By the Chernoff bound and the union bound, if we perform $q = O(k \log k/\epsilon^2)$ queries,
we would rule out all $k!$ possible isomorphisms with probability at least 9/10 and reject the function
as required.
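One way to make the choice of $q$ concrete is via Hoeffding's inequality, used here as one standard form of the Chernoff bound; the constants below are illustrative and are not claimed to be those of the original analysis. Let $X$ denote the fraction of the $q$ independent samples that are consistent with a fixed $f_\pi$, so that $\mathbb{E}[X] \le 1 - 8\epsilon/9$, while the algorithm accepts only if $X \ge 1 - \epsilon/2$.

```latex
\Pr\left[X \ge 1 - \tfrac{\epsilon}{2}\right]
  \le \Pr\left[X - \mathbb{E}[X] \ge \tfrac{8\epsilon}{9} - \tfrac{\epsilon}{2}\right]
  = \Pr\left[X - \mathbb{E}[X] \ge \tfrac{7\epsilon}{18}\right]
  \le \exp\left(-2q\left(\tfrac{7\epsilon}{18}\right)^{2}\right)
  = \exp\left(-\tfrac{49\,q\,\epsilon^{2}}{162}\right),
\qquad
k! \cdot \exp\left(-\tfrac{49\,q\,\epsilon^{2}}{162}\right) \le \tfrac{1}{10}
  \iff
  q \ge \frac{162}{49\,\epsilon^{2}}\left(\ln k! + \ln 10\right).
```

Since $\ln k! \le k \ln k$, taking $q = O(k \log k/\epsilon^2)$ satisfies the last condition.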

On the other hand, if $g$ is isomorphic to $f$, then we know there exists with probability at least
9/10 some isomorphism $f_\pi$ which maps the asymmetric variables of $f$ into the sets of $J$, such that

$$\mathrm{dist}(f_\pi, h) \le \mathrm{dist}(f_\pi, g) + \mathrm{dist}(g, h) \le \epsilon/500 + \epsilon/250 .$$

For this isomorphism, with high probability much more than a $(1 - \epsilon/2)$-fraction of the queries would
be consistent and we would therefore accept $g$ as we should. □

C.3 Efficient sampler for partially symmetric functions 

We first provide the algorithm for efficiently generating a $\delta$-sampler for partially symmetric
functions. The algorithm performs its preprocessing by calling Partially-Symmetric-Test. Given
the output of the algorithm, we query the function once for each call to the sampler, according to
$\mathcal{D}^W_{\mathcal{I}}$, and return the result.

Algorithm 5 Partially-Symmetric-Sampler($f$, $k$, $\delta$, $\eta$)
1: Perform Partially-Symmetric-Test($f$, $k$, $\eta\delta$).
2: Let $\mathcal{I}$ and $W \in \mathcal{I}$ be the partition and workspace used by the algorithm.
3: Let $J$ be the union of $k$ parts in $\mathcal{I} \setminus \{W\}$ that were identified by the algorithm.
4: Return the following sampler:
5:     Choose a random $y \sim \mathcal{D}^W_{\mathcal{I}}$.
6:     Let $x \in \{0,1\}^k$ be the value assigned to the parts in $J$.
7:     Yield the triplet $(x, |y| - |x|, f(y))$.
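The sampler returned in steps 4 to 7 can be pictured as a closure over the data found during preprocessing. The sketch below is a minimal illustration with hypothetical names: `draw_y` stands for any routine that returns one input drawn according to the sampling distribution, and `j_parts` lists the $k$ identified parts, each as a non-empty list of coordinates.

```python
def make_sampler(f, draw_y, j_parts):
    """Build a core sampler in the spirit of steps 4-7 (illustrative sketch).

    f:       the function being sampled, called once per sample
    draw_y:  hypothetical callable returning one input y (sequence of bits)
             that is constant on every part in j_parts
    j_parts: the k identified parts, each a non-empty list of coordinates
    """
    def sample():
        y = draw_y()
        # Every part of J is constant under y, so one coordinate per part suffices.
        x = tuple(y[part[0]] for part in j_parts)
        return x, sum(y) - sum(x), f(y)
    return sample
```

Each call to the returned `sample` function makes exactly one query to `f`, matching the single-query property described above.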



Proof of Theorem 3. The algorithm for generating the sampler is described by
Partially-Symmetric-Sampler, which performs $O(\frac{k}{\eta\delta} \log \frac{k}{\eta\delta})$ preprocessing queries to the function. What remains to be
proved is that indeed with good probability, the algorithm returns a valid sampler.

Let $h$ be the function defined in the analysis of Theorem 1, which satisfies the conditions of
Lemma 13. Recall that its asymmetric variables were separated by $\mathcal{I}$ and appear in $J$. Following
this analysis and that of Partially-Symmetric-Test, one can see that with probability at least
$1 - \eta$ we would not reject $f$ when calling Partially-Symmetric-Test. Moreover, the samples
would be $\delta/2$-close to sampling the core of $h$, which is by itself $\delta/2$-close to $f$. Therefore, overall
our samples would be $\delta$-close to sampling the core of $f$.

The last part in completing the proof of the theorem is showing that we sample the core with
distribution $\delta$-close to $\mathcal{D}^*_{k,n}$. By Proposition 1, the total variation distance between sampling the
core according to $\mathcal{D}^*_{k,n}$ and sampling it according to $\mathcal{D}^W_{\mathcal{I}}$ is at most $c/k$ for our choice of $0 < c < 1$,
which we can choose to be at most $\delta$. □

Notice that if the function $f$ is not $(n-k)$-symmetric but still very close (say $(k/\eta\delta)^{-2}$-close),
applying the same algorithm will provide a good sampler for an $(n-k)$-symmetric function $f'$ close
to $f$. The main reason is that most likely, we will not query any location of the function where it
does not agree with $f'$.